Executive Summary


Background. 


This  report  and  two  companion  reports  have  been 
derived  from  the  chapter  on  detection  theory  prepared 
for  the  High  Gain  Initiative  (HGI)  report.  The  two 
companion  reports,  "Detection  Processing  for 
Undersea  Surveillance"  (Confidential)  and  "Adaptive 
Locally  Optimum  Processing  for  Interference 
Suppression  from  Communication  and  Undersea 
Surveillance  Signals"  focus  on  detection  processing. 
"Detection  Processing  for  Undersea  Surveillance" 
summarizes  the  status  of  detection  processing  for 
undersea  surveillance  at  the  time  of  the  initiation  of  the 
HGI  program  and  deals  primarily  with  adaptive 
filtering. It summarizes the culmination of many years
of effort in the field by J. Zeidler and others. "Adaptive
Locally Optimum Processing for Interference
Suppression from Communication and Undersea
Surveillance Signals" is the culmination of work by J.
Bond, S. Hui, D. Stein, and others on adaptive locally
optimum processing for interference suppression and
of V. Broman on target tracking. This report
summarizes the results of statistical analyses of
power variations due to target motion, obtained both
through simulations and from actual power variation
data collected during an HGI experiment. The analysis
involved fitting the power variation data with Gaussian
mixture models. This work was primarily accomplished
by D. Stein for the HGI program.

Introduction. 


Gaussian  mixture  models,  and  a  particular  class  of 
mixture  models  known  as  Middleton  Class  A  Noise 
Models,  have  been  widely  investigated  to  model  the 
acoustic  noise  generated  by  distant  shipping.  In  this 
report, we discuss the use of Gaussian mixture
models to describe the noise generated by a nearby
ship.  The  investigation  consists  of  an  analysis  of 
selected  HGI  hydrophone  experimental  data  to 
determine  its  statistical  characteristics  and  of 
simulations  of  the  power  variations  expected  at  a 
hydrophone  within  the  deep  sound  channel  due  to  the 
motion  of  a  source  relative  to  the  hydrophone. 

Summary  of  Results. 


The following results were obtained for three half-hour
segments of experimental hydrophone data selected
for detailed analysis:

(a) the complex samples generated from the
data from one of the segments were better fit by a
two-state Gaussian mixture model than by a Gaussian
distribution,

(b) the distributions of the complex samples
for two of the segments were not significantly different
from Gaussian distributions, and

(c) the phase could be modeled as uniform on
(-π, π] and the real and imaginary components of the
complex samples modeled as independent for all three
segments.

The  simulations  of  power  variations  due  to  target 
motion  led  to  statistics  which  were  better  modeled  by 
either  a  two-state  or  three-state  Gaussian  mixture 
model than by a Gaussian distribution. Often the data
were nearly as well fit by a two-state model as by a
three-state model. In addition, the two-state models
exhibited  consistency  over  a  track  traversing  multiple 
convergence  zones  and  the  model  parameters  formed 
two  clusters. 


Conclusions.

Nearby shipping can sometimes be well modeled by a
Gaussian mixture model. The mixture nature of the
statistics can be attributed to changing modal
interactions due to target motion.


Characterization of Interference
Statistics

Introduction.

Information processing for ocean basin surveillance
requires the detection of submarine lines in the
presence of interference. The interference can be
broadband or narrowband, originating from surface
ship traffic, marine life, or ocean waves. The
optimum detection processing is known when the
interference can be well modeled as stationary
Gaussian noise. A number of powerful processing
techniques that provide improved detection
performance over traditional processing are available
when the interference is non-Gaussian. The
broadband component of ocean noise is usually
well modeled by Gaussian noise. The above
considerations  led  us  to  focus  our  attention  on  the 
analysis  of  narrowband  interference,  presumably 
arising  from  ships,  to  determine  the  time  scales  over 
which  the  statistics  can  be  modeled  as  stationary  and 
also  to  determine  the  likelihood  that  the  narrowband 
interference  is  non-Gaussian. 

The  Atlantic  and  Pacific  ocean  basins  may  each 
contain  several  thousand  ships,  say  3000,  at  any 
given  time.  Suppose  that  the  high  gain  array  forms 
100  or  more  beams,  so  that  possibly  30  to  60  ships 
are within a horizontal beam. For a matched field
beamformer, the interferences of significant level
occur  at  convergence  zone  ranges  from  the  spatial  cell 
of  interest,  reducing  the  number  of  ships  generating 
significant  interference  to  about  one-tenth  of  the 
number of ships in the beam. Furthermore, the ships
are of different types, having different equipment and
from  different  countries  with  different  electrical 
systems,  so  that  narrowband  lines  from  only  a  few 
ships,  if  any,  would  be  expected  to  interfere  with  the 
detection  of  a  narrowband  signal  of  interest  in  a 
particular  beamformer  spatial  cell.  Each  of  these 
ship-generated  interfering  lines  would,  in  general,  be 
expected  to  arrive  with  time  varying  power,  because  of 
the  variation  in  propagation  conditions.  On  some 
occasions,  a  particularly  strong  interfering  source  on 
another  beam  may  impact  performance  on  the  beam 
being considered. In any case, it is reasonable to
suppose that for a high gain array, two different cases
are likely to occur: first, no single narrowband
interferer dominates the background noise, and
second, a few narrowband interferers dominate the
background noise (Heitmeyer, Davis, and Yen, 1985).
In  the  first  case,  it  is  reasonable  to  expect  that  the 
background  noise  will  be  Gaussian-like.  In  the 
second  case,  it  is  reasonable  to  suppose  that  the 
background  noise  will  be  changing  rapidly  due  to 
changing  propagation  conditions  to  the  array  from 
these  few  dominating  sources  of  the  background 
noise.  A  reasonable  model  for  this  second  case  is  a 
Gaussian  mixture  model;  mixture  models 
(Titterington, Smith, and Makov, 1985) are discussed
later. 

Consider the case of a single dominating interferer
and a target of interest as shown in figure 1. The
interference arises from a ship whose acoustic energy
couples into the deep sound channel and whose
location is such that the propagation paths connecting
it with the high gain array pass through a submarine
of interest. Under these conditions, the signal-to-noise
ratio at the output of the beamformer of the high
gain array is expected to be quite dynamic. To see
why this might be the case, consider a surface ship
moving at 15 knots through an area whose cross-section
is determined by beam width and by the width
of a convergence zone. At 10 convergence zones,
roughly 300 nautical miles, the convergence zone
would have a width of about 3 miles, while the arc
length associated with a 5-degree beam (roughly 1/10
of a radian) would be about 30 miles. A merchant ship
could pass through the beam-convergence zone area in
as little as 12 minutes or as much as 2 hours.
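The transit-time arithmetic above can be checked with a short sketch. It assumes the "miles" quoted are nautical miles (so that one knot is one mile per hour) and uses the rounded 3-mile and 30-mile dimensions from the text; the function names are illustrative only.

```python
import math


def arc_length_nmi(range_nmi, beamwidth_deg):
    """Cross-range extent (nautical miles) of a beam of the given angular width."""
    return range_nmi * math.radians(beamwidth_deg)


def transit_minutes(distance_nmi, speed_knots):
    """Minutes needed to cover distance_nmi at speed_knots (1 knot = 1 nmi per hour)."""
    return 60.0 * distance_nmi / speed_knots


if __name__ == "__main__":
    # A 5-degree beam at 300 nmi spans roughly 26 nmi, rounded to 30 in the text.
    print(arc_length_nmi(300.0, 5.0))
    # Crossing the 3-mile convergence zone width at 15 knots.
    print(transit_minutes(3.0, 15.0))
    # Crossing the 30-mile beam width at 15 knots.
    print(transit_minutes(30.0, 15.0))
```

The two transit times bracket how long a single ship can remain a dominant interferer within one beam-convergence zone cell.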

Mohnkern  (1989)  has  studied  the  effects  of  interferer 
motion  on  Bartlett  and  Minimum  Variance 
Distortionless  Response  (MVDR)  beamformers  for 
horizontal  arrays  using  plane  wave  propagation.  For 
Bartlett beamforming, he finds that the time-averaged
main  lobe  response  is  broadened  and  under  some 
conditions  reduced  and  that  the  nulls  are  blurred.  For 
MVDR  beamforming  he  finds  that  the  main  lobe  is 
similarly  affected  and  that  moving  interferers  are 
associated  with  multiple-rank  submatrices  of  the 
spatial  cross-spectral  matrix.  The  details  depend  on 
the  speed  of  the  interferers,  array  geometry,  and  the 
amount  of  data  used  to  estimate  the  spatial 
cross-spectral  matrix.  In  this  report,  we  present 
simulation  results  to  show  the  impact  of  ship  motion 
on  received  power  at  a  vertical  array  for  selected 
spatial cells presumed to contain a target of interest.

The  motion  of  the  interferers  limits  the  amount  of  data 
available  that  may  be  used  to  estimate  a  spatial 
cross-spectral  matrix  to  a  time  interval  for  which  the 
interference  may  be  modeled  as  stationary. 
Unfortunately, this time interval tends to decrease with
increasing array size, while the amount of data
required to estimate the spatial cross-spectral matrix
increases. For an array of n sensors, the maximum
likelihood estimate of the spatial cross-spectral matrix
requires 2n independent, identically distributed data
points from each hydrophone for the expected
output signal-to-noise ratio obtained with the estimated
spatial cross-spectral matrix to be no less than one-half
of that achieved with a priori knowledge of the spatial
cross-spectral matrix (Reed, Mallett, and Brennan,
1974).
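The sample-support requirement can be illustrated numerically. The sketch below assumes the closed form usually quoted from Reed, Mallett, and Brennan for the expected normalized output signal-to-noise ratio, E[rho] = (K + 2 - n) / (K + 1) for K snapshots and n sensors; that formula is not stated in this report, and the function names are hypothetical.

```python
def expected_normalized_snr(num_snapshots, num_sensors):
    """Expected output SNR relative to the known-matrix optimum.

    Assumes the commonly quoted result E[rho] = (K + 2 - n) / (K + 1),
    valid when the number of snapshots K is at least the number of sensors n.
    """
    k, n = num_snapshots, num_sensors
    if k < n:
        raise ValueError("need at least as many snapshots as sensors")
    return (k + 2 - n) / (k + 1)


def snapshots_for_half_snr(num_sensors):
    """Smallest K for which the assumed formula gives E[rho] >= 1/2."""
    return 2 * num_sensors - 3


if __name__ == "__main__":
    n = 200  # number of hydrophones, chosen only for illustration
    for k in (n, snapshots_for_half_snr(n), 2 * n, 5 * n):
        print(k, expected_normalized_snr(k, n))
```

Under this assumption, roughly 2n snapshots give an expected loss of about 3 dB, which is the rule of thumb cited in the text.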

The motion of the interferers causes the interference
in the beamformer outputs to be nonstationary and
thus impacts detection processing. Mixture models
have been studied to model nonstationary interference
in communications and undersea surveillance (Baker
and Gualtierotti, 1986, 1988, to appear; Berry, 1981;
Vastola, 1984; Powell and Wilson, 1989). For these
models, adaptive locally optimum processing
techniques have been developed. The theory and
implementation of this processing constitute the next
subsection of this report. Middleton (1966, 1967, 1977,
1983, 1984, 1991) and Middleton and Spaulding (1983,
1986) pioneered these efforts by determining
particular classes of mixture models that are derived
from statistical models of the source locations of the
interference relative to the receiver. In these models,
the Middleton Class A, B, and C models, the mixture
parameters are related to general statistical properties
of the interference sources. Bouvet and Schwartz
(1988, 1989) have fit broadband ocean noise data with
the Middleton Class A noise model. These efforts
have involved fitting time domain noise data by a
Gaussian mixture model. In this report, we extend
these efforts by investigating the suitability of modeling
narrowband ocean noise data in real time in the
frequency domain by Gaussian mixture models.

Gaussian  mixture  models  are  generalizations  of 
Middleton noise models. A Gaussian mixture model is
defined as follows. Suppose that the indices
$\{1, 2, \ldots, M\}$ of the complex samples $z_1, z_2, \ldots, z_M$ can
be partitioned into $S$ disjoint sets $M_1, M_2, \ldots, M_S$ with
the following properties:

(a) $\lim_{M \to \infty} m_k / M$ exists and equals $p_k > 0$
for $k = 1, 2, \ldots, S$, with $m_k$ the number of the samples
with indices from the set $\{1, 2, \ldots, M\}$ in $M_k$;

(b) the real and imaginary components of the
interference for the samples contained in the set $M_k$
have identical zero-mean distributions with variance
$\sigma_k^2$ for $k = 1, 2, \ldots, S$, with $\sigma_1^2 < \sigma_2^2 < \cdots < \sigma_S^2$.

If these distributions are zero-mean Gaussian
distributions, the mixture model is called a Gaussian
mixture model, which is completely described by the
parameter set $(p_1, \ldots, p_S, \sigma_1^2, \ldots, \sigma_S^2)$. The parameter $p_k$
is called the k-th state probability and the parameter
$\sigma_k^2$ is called the k-th state variance. The probability
density function for the normed square of a complex
random variable described by a Gaussian mixture
model is a sum of exponentials

$$p(u) = \sum_{i=1}^{S} \frac{p_i}{2\sigma_i^2} \exp\!\left(-\frac{u}{2\sigma_i^2}\right), \qquad u = |z|^2 \ge 0.$$
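A minimal sketch of this definition follows. It assumes that each state has variance sigma_k^2 in both the real and the imaginary component, so that the normed square within a state is exponentially distributed with mean 2 sigma_k^2, and it draws the state of each sample independently, which is one simple way to obtain the limiting state fractions p_k. The function names are illustrative.

```python
import math
import random


def sample_complex_mixture(probs, variances, count, rng=None):
    """Draw complex samples from a zero-mean Gaussian mixture.

    variances[k] is the variance of the real and of the imaginary part
    in state k; states are drawn independently with probabilities probs.
    """
    rng = rng or random.Random(0)
    samples = []
    for _ in range(count):
        k = rng.choices(range(len(probs)), weights=probs)[0]
        s = math.sqrt(variances[k])
        samples.append(complex(rng.gauss(0.0, s), rng.gauss(0.0, s)))
    return samples


def power_pdf(u, probs, variances):
    """Density of u = |z|^2: a probability-weighted sum of exponentials."""
    return sum(p / (2.0 * v) * math.exp(-u / (2.0 * v))
               for p, v in zip(probs, variances))


def power_cdf(u, probs, variances):
    """Cumulative distribution of u = |z|^2 under the same assumptions."""
    return 1.0 - sum(p * math.exp(-u / (2.0 * v))
                     for p, v in zip(probs, variances))


if __name__ == "__main__":
    probs, variances = [0.7, 0.3], [1.0, 10.0]
    z = sample_complex_mixture(probs, variances, 20000)
    print(sum(abs(v) ** 2 for v in z) / len(z))  # near 2 * (0.7*1 + 0.3*10)
```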

Noise  Statistics  for  a  Moving  Interferer. 

In this subsection, we show by simulation that
multi-state  Gaussian  mixture  models  can  be  used  to 
describe  interferer  statistics  at  the  output  of  a  vertical 
array  when  the  source  of  the  interference  is  moving 
relative  to  the  array.  For  a  billboard  array,  these 
simulations  address  the  properties  of  interferers 
located  within  the  same  horizontal  beam  as  the  signal 
of  interest.  Similar  results  for  interferer  statistics  at  the 
output  of  a  hydrophone  located  near  the  center  of  the 
deep  sound  channel  are  discussed  in  appendix  B. 

The  applicability  of  a  Gaussian  mixture  model  to 
interference  generated  by  multiple  moving  ships 
follows  readily  from  the  applicability  of  a  Gaussian 
mixture  model  for  interference  generated  by  a  single 
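moving ship. In particular, if $X_1, X_2, \ldots, X_n$ are
independent Gaussian mixture random variables and
$Z = X_1 + X_2 + \cdots + X_n$, then

$$p_Z(x) = (2\pi)^{-1} \sum_{k_1=1}^{m_1} \sum_{k_2=1}^{m_2} \cdots \sum_{k_n=1}^{m_n} p_{1k_1}\, p_{2k_2} \cdots p_{nk_n}\, p(x),$$

where

$$p(x) = \frac{1}{\sigma_{1k_1}^2 + \sigma_{2k_2}^2 + \cdots + \sigma_{nk_n}^2} \exp\!\left(-\frac{|x|^2}{2\left(\sigma_{1k_1}^2 + \sigma_{2k_2}^2 + \cdots + \sigma_{nk_n}^2\right)}\right)$$

when $X_i$ is an $m_i$-state Gaussian mixture with state
probabilities $p_{ik_i}$ and state variances $\sigma_{ik_i}^2$, with

$$\sum_{k_i=1}^{m_i} p_{ik_i} = 1 \quad \text{and} \quad p_{ik_i} > 0, \quad 1 \le k_i \le m_i, \quad \text{for } 1 \le i \le n.$$

This result follows by induction on n using the fact that
the density of the sum of two independent random
variables is the convolution of their densities, and that
the convolution of two zero-mean Gaussian densities
is a zero-mean Gaussian density with variance the
sum of their variances.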
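The result can be restated as a rule on the mixture parameters: each state of the sum corresponds to one choice of state per interferer, its probability is the product of the chosen state probabilities, and its variance is the sum of the chosen state variances. The following sketch enumerates those combined states; the function name and the example parameters are illustrative.

```python
from itertools import product


def sum_of_mixtures(mixtures):
    """Combine independent zero-mean Gaussian mixtures into the mixture of their sum.

    mixtures is a list of (probs, variances) pairs, one per interferer.
    Returns (probs, variances) with one entry per combination of states.
    """
    per_source_states = [list(zip(probs, variances)) for probs, variances in mixtures]
    out_probs, out_variances = [], []
    for combination in product(*per_source_states):
        prob, variance = 1.0, 0.0
        for p, v in combination:
            prob *= p        # independent sources: probabilities multiply
            variance += v    # zero-mean Gaussians: variances add
        out_probs.append(prob)
        out_variances.append(variance)
    return out_probs, out_variances


if __name__ == "__main__":
    ship_a = ([0.5, 0.5], [1.0, 4.0])
    ship_b = ([0.25, 0.75], [2.0, 3.0])
    print(sum_of_mixtures([ship_a, ship_b]))
```

The number of states grows as the product of the individual state counts, although states with nearly equal variances could in practice be merged.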

Simulations  were  performed  to  approximate  the 
received  power  fluctuations  from  a  moving  continuous 
wave  (23.804  Hz)  interference  source.  Figure  2 
shows  the  interferer-source-receiving  array  geometry 
for the simulations. The geometry is described by an
interfering ship with horizontal range $r(t)$ from a
vertical array, an interfering surface ship modeled as a
discrete source at depth $d_s$, and a vertical array
consisting of hydrophones at depth $d$. The interfering
ship source is modeled as moving with fixed radial
velocity away from the array so that $r(t) = r_0 + vt$,
where time starts at 0, $r_0$ is the initial distance
from the interfering ship source to the array, and $v$ is its
radial velocity. The simulations are designed to
examine the properties of the ship interference as
received by the hydrophones or for specified vertical
beamformer spatial cells. The beamformer spatial
cells are chosen to represent cells of interest for the
detection of a submerged submarine.

A  range  invariant  normal  mode  model  was  used  to 
propagate  radiated  power  from  the  interferer  source  to 
the  receiving  array.  The  inputs  to  this  model  are 
ocean  depth,  water  density  at  the  source  depth  p(ds), 
water  sound  speed  profile,  and  sound  speed  profile  in 
the  sediment.  The  sound  speed  profiles  chosen  for 
the simulations are shown in figures 3 and 4. These
sound  speed  profiles  were  derived  from 
measurements  at  the  MDA  array  site,  and  the  depth  of 
the  ocean  was  chosen  as  5190  meters,  which  is  the 
depth  at  the  MDA  array. 

The modal functions and horizontal wave numbers
were calculated using Kraken C (Porter 1992) and the
pressure field is calculated using the equation

$$p(r(t), d, d_s) = \frac{i}{\sqrt{8\pi r(t)}\,\rho(d_s)}\, Z(r(t), d, d_s),$$

where

$$Z(r(t), d, d_s) = \sum_{m=1}^{M} Z_m(d_s)\, Z_m(d)\, \frac{1}{\sqrt{k_m}}\, e^{-i k_m r(t)}$$

with $Z_m$ the m-th modal function, $k_m$ the m-th
horizontal wave number, and $M$ the number of modes.
In this equation, the pressure field is represented by a
complex number. Ninety-eight modes are used for
these calculations to capture the significant
propagation features.
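A minimal sketch of a normal-mode sum of this kind is given below. The mode amplitudes and wave numbers are toy inputs standing in for the output of a mode solver, and the sign of the exponent and the constant phase of the prefactor are assumptions; neither affects the received power, which depends only on the magnitude of the pressure.

```python
import cmath
import math


def modal_sum(range_m, modes_at_source, modes_at_receiver, wavenumbers):
    """Sum over modes of Z_m(d_s) Z_m(d) exp(-i k_m r) / sqrt(k_m)."""
    total = 0j
    for z_s, z_d, k_m in zip(modes_at_source, modes_at_receiver, wavenumbers):
        total += z_s * z_d * cmath.exp(-1j * k_m * range_m) / math.sqrt(k_m)
    return total


def pressure(range_m, density_at_source, modes_at_source, modes_at_receiver, wavenumbers):
    """Complex pressure with cylindrical spreading 1 / sqrt(8 pi r)."""
    prefactor = 1j / (math.sqrt(8.0 * math.pi * range_m) * density_at_source)
    return prefactor * modal_sum(range_m, modes_at_source, modes_at_receiver, wavenumbers)


if __name__ == "__main__":
    # Two hypothetical modes; real values would come from a mode solver.
    src, rcv, k = [0.10, 0.05], [0.20, -0.10], [0.100, 0.099]
    for r in (10e3, 20e3, 40e3):
        p = pressure(r, 1000.0, src, rcv, k)
        print(r, 20.0 * math.log10(abs(p)))
```

Because the modes have slightly different horizontal wave numbers, their relative phases change with range, which is the mechanism behind the power fluctuations studied here.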

To  study  the  short-term  power  fluctuations  at  the 
receiving  array  that  occur  as  an  interference  source 
moves,  the  pressure  field  was  calculated  in  increments 
of  0.065  km  from  a  range  of  10  km  to  a  range  of  876 
km. The range increments correspond to the distance
a ship with a radial velocity of 5 meters/second (10
knots) traverses in 13 seconds. The receiving array
was  modeled  as  consisting  of  200  hydrophones,  one 
every  10  meters,  extending  from  a  depth  of  10  meters 
to  a  depth  of  2000  meters.  The  length  of  the  vertical 
array  corresponds  to  the  length  of  the  MDA  array. 
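The grid arithmetic in this paragraph can be verified with a few lines. The sketch assumes the range grid starts at 10 km and includes every whole step up to 876 km; the function names are illustrative.

```python
def knots_from_mps(speed_mps):
    """Convert meters per second to knots (1 knot = 1852 m per hour)."""
    return speed_mps * 3600.0 / 1852.0


def range_step_km(speed_mps, interval_s):
    """Distance in kilometers covered at speed_mps during interval_s seconds."""
    return speed_mps * interval_s / 1000.0


def number_of_ranges(start_km, stop_km, step_km):
    """Number of grid points from start_km up to stop_km in whole steps."""
    return int((stop_km - start_km) / step_km) + 1


def hydrophone_depths_m(count=200, spacing_m=10.0, first_m=10.0):
    """Depths of a uniformly spaced vertical array."""
    return [first_m + spacing_m * k for k in range(count)]


if __name__ == "__main__":
    step = range_step_km(5.0, 13.0)
    print(knots_from_mps(5.0))                # just under 10 knots
    print(step)                               # 0.065 km
    print(number_of_ranges(10.0, 876.0, step))
    depths = hydrophone_depths_m()
    print(len(depths), depths[0], depths[-1])
```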

The  received  pressures  from  the  moving  interference 
source  at  10  meters  depth  were  beamformed  by  a 
Bartlett  beamformer  to  characterize  the  interferer 
power  fluctuations  at  the  output  of  the  beamformer  for 
a  moving  interferer.  A  beamformer  spatial  cell  is 
characterized  by  the  depth  of  the  cell  and  its  range 
from  the  vertical  array  of  hydrophones.  Results  were 
obtained  for  four  spatial  cells  at  a  depth  of  100  meters 
and  ranges  of  434,  450,  464,  and  470  kilometers. 
These  cells  are  located  between  convergence  zones  7 
and  8  at  approximately  434  and  496  kilometers.  For 
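each cell, the interference power was calculated as

$$\left( \frac{\left| V_r^{*} V_0 \right|^{2}}{V_0^{*} V_0} \right)$$

with $V_0$ the steering vector for the spatial cell and $V_r$
the steering vector for the interferer at range $r$ and
depth 10 meters. In particular,

$$V_0 = \big(p(r_c, d_1, d_t),\; p(r_c, d_2, d_t),\; \ldots,\; p(r_c, d_{200}, d_t)\big)$$
and
$$V_r = \big(p(r, d_1, d_s),\; p(r, d_2, d_s),\; \ldots,\; p(r, d_{200}, d_s)\big)$$
with $r_c$ the range of the spatial cell,

$d_k = 10 + 10(k - 1)$ meters for $k = 1, 2, \ldots, 200$,
$d_s = 10$ meters, and
$d_t = 100$ meters.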
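A minimal sketch of this Bartlett power calculation, assuming the power is the squared magnitude of the inner product of the two vectors normalized by the squared norm of the cell steering vector. The vectors here are short toy examples; in the simulation they would hold the modeled pressures at the 200 hydrophone depths.

```python
def bartlett_power(v_r, v_0):
    """Return |v_r^* v_0|^2 / (v_0^* v_0) for complex vectors of equal length."""
    if len(v_r) != len(v_0):
        raise ValueError("vectors must have the same length")
    inner = sum(a.conjugate() * b for a, b in zip(v_r, v_0))
    norm_sq = sum(abs(b) ** 2 for b in v_0)
    return abs(inner) ** 2 / norm_sq


if __name__ == "__main__":
    cell = [1 + 1j, 2 + 0j, -1j]           # hypothetical steering vector
    matched = list(cell)                   # interferer located in the cell
    mismatched = [1 + 0j, 0j, 1 + 0j]      # interferer located elsewhere
    print(bartlett_power(matched, cell))
    print(bartlett_power(mismatched, cell))
```

By the Cauchy-Schwarz inequality the output never exceeds the total received power of the interferer, and it reaches that value only when the interferer field is proportional to the steering vector.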

The  Bartlett  beamformer  results  are  presented  in 
figures  5a,  b,  c,  and  d.  Beam  patterns  with  distinct 
peaks  at  roughly  convergence  zone  spacing  (figures 
5a,  b,  and  c)  occurred  if  the  range  of  the  cell  was 
within  30  kilometers  of  a  convergence  zone,  while  a 
more  complicated  pattern,  as  shown  in  figure  5d,  was 
produced  for  other  cells. 

The  observed  power  at  the  hydrophone  level  and  at 
the  output  of  a  Bartlett  beamformer  attributable  to  a 
moving  interferer  is  likely  to  be  nonstationary  on  a 
timescale  greater  than  the  time  required  by  the  source 
to  move  about  10  kilometers  (one-sixth  of  a 
convergence  zone).  The  power  may  vary  by  more 
than 20 dB over half-hour segments. The periods of
the  oscillation  are  variable  and  depend  upon  the  range 
of  the  interferer  from  the  array,  the  position  of  the  ship 
relative  to  convergence  zones,  and  the  array 
configuration.  Note  that  figure  5  presents  the  power 
fluctuations  due  to  a  single  moving  source.  For  real 
data,  the  noise  power  fluctuations  would  probably  be 
reduced  by  the  presence  of  background  noise.  As  a 
result,  the  dynamic  range  of  the  noise  and,  in 
particular,  the  ratio  of  high-state  variance  to  low-state 
variance  for  a  Gaussian  mixture  model  depends  on 
the  interference-to-background  noise  ratio.  It  follows 
that  the  dynamic  range  of  actual  beamformed  data 
would  be  less  than  the  estimates  obtained  from  our 
simulation  results  for  a  vertical  array. 

The  predicted  beamformed  interferer  power  levels 
from  the  moving  interferer  source  at  a  depth  of  10 
meters  were  also  used  to  assess  as  a  function  of  time 
whether  the  power  levels  could  better  be  described  as 
arising  from  a  one-state  or  two-state  Gaussian  mixture 
model. (We extend the use of the mixture model
terminology to include a one-state Gaussian mixture
model, by which we mean a Rayleigh distribution.) To
simulate a Rayleigh channel, the amplitudes were
multiplied by independent unit variance complex
Gaussian random variables. In particular, if
$\{A_1, A_2, \ldots, A_n\}$ is a set of amplitudes, the data
subject to statistical analysis were
$\{z_1 = A_1 c_1, z_2 = A_2 c_2, \ldots, z_n = A_n c_n\}$, where
$c_1, c_2, \ldots, c_n$ are independent complex random
variables with a zero-mean unit variance circular
Gaussian distribution. As a result, the data are locally
circular  Gaussian  and  the  resulting  assessment 
determines  if  the  amplitude  fluctuations  are  best 
captured  by  a  single-state  or  a  multiple-state  Gaussian 
mixture  model. 
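The Rayleigh-channel step can be sketched as follows. The sketch interprets "unit variance" as a total variance of one, that is, a variance of 1/2 in each of the real and imaginary parts; the report does not state which normalization was used, and the function names are illustrative.

```python
import math
import random


def unit_circular_gaussian(rng):
    """One zero-mean circular complex Gaussian sample with E|c|^2 = 1 (assumed)."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def apply_rayleigh_channel(amplitudes, rng=None):
    """Multiply each real amplitude A_k by an independent circular Gaussian c_k."""
    rng = rng or random.Random(0)
    return [a * unit_circular_gaussian(rng) for a in amplitudes]


if __name__ == "__main__":
    amplitudes = [2.0] * 20000
    z = apply_rayleigh_channel(amplitudes)
    # For a constant amplitude A the mean of |z|^2 should be close to A^2.
    print(sum(abs(v) ** 2 for v in z) / len(z))
```

With a constant amplitude the magnitudes are Rayleigh distributed, which is the one-state case; slowly varying amplitudes produce the multiple-state behavior examined in the analysis.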

Statistical  analysis  was  performed  for  four  spatial  cells 
as  the  source  moved  from  350  to  450  kilometers  from 
the  array.  Similar  results  for  a  source  moving  away 
from  a  hydrophone  are  presented  in  appendix  B. 
Models were fitted using the Expectation-Maximization
(EM) algorithm (Zabin and Poor, 1989,
1990, 1991; Powell and Wilson, 1989), as described in
appendix A, to 128 successive beamformer
amplitudes with 50% overlap. The amplitudes were
multiplied  by  independent  unit  variance  complex 
Gaussian  random  variables  to  emulate  complex 
samples.  The  results  of  the  statistical  analysis  are 
presented  in  figures  6  through  9  for  spatial  cells  at  a 
depth of 100 meters and ranges from the array of 434,
450, 464, and 470 kilometers, respectively. The
abscissa is the horizontal distance between the
vertical array and the first point of each of the data
sets. Figures 6a through 9a present the significance
levels of the Kendall-Mann tau test for the random
variable 128 sample average of norms squared and
of the Kolmogorov-Smirnov two-sample test for
stationarity, which compares the distribution of the first
half of the samples with the distribution of the second
half of the samples; figures 6b through 9b show the
significance levels of the Kolmogorov-Smirnov
one-sample test for the one-state and two-state
Gaussian mixture model distribution (see appendix A
for a description of the statistical tests). Figures 6
through  9  show  the  estimates  of  the  high-state  to 
low-state  variance  and  the  low-state  probability  of  the 
best  two-state  fit  of  the  data  obtained  using  the  EM 
algorithm.  Figure  10  shows  the  joint  probability 
density  function  of  the  low-state  probability  and  the 
ratio  of  the  high-state  to  low-state  variance  for  the 
two-state  mixture  model  parameters  that  best  describe 
the  interference  data  for  the  four  spatial  cells. 
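A generic EM fit of a two-state model can be sketched as follows. This is not the appendix A procedure, only a textbook EM iteration under the assumption that each state is zero-mean circular Gaussian, so that the likelihood depends on the data only through the powers u = |z|^2 and each state contributes an exponential density with mean 2 sigma_k^2. The initialization, iteration count, and function names are illustrative choices.

```python
import math
import random


def synthetic_powers(count, p_low, var_low, var_high, rng):
    """Powers |z|^2 drawn from a two-state Gaussian mixture (test data only)."""
    powers = []
    for _ in range(count):
        s = math.sqrt(var_low if rng.random() < p_low else var_high)
        powers.append(rng.gauss(0.0, s) ** 2 + rng.gauss(0.0, s) ** 2)
    return powers


def overlapping_windows(values, length=128, overlap=0.5):
    """Split a sequence into windows of the given length and fractional overlap."""
    hop = max(1, int(length * (1.0 - overlap)))
    return [values[i:i + length] for i in range(0, len(values) - length + 1, hop)]


def fit_two_state_mixture(powers, iterations=200):
    """Return (p_low, var_low, var_high) estimated by EM from powers |z|^2."""
    n = len(powers)
    ordered = sorted(powers)
    half = n // 2
    mean_low = sum(ordered[:half]) / half            # initial guess: lower half
    mean_high = sum(ordered[half:]) / (n - half)     # initial guess: upper half
    p_low = 0.5
    for _ in range(iterations):
        # E-step: probability that each sample belongs to the low state.
        resp = []
        for u in powers:
            a = p_low / mean_low * math.exp(-u / mean_low)
            b = (1.0 - p_low) / mean_high * math.exp(-u / mean_high)
            resp.append(a / (a + b) if a + b > 0.0 else 0.0)
        weight = sum(resp)
        if weight <= 0.0 or weight >= n:
            break                                    # degenerate: one state is empty
        # M-step: update the state probability and the two mean powers.
        p_low = weight / n
        mean_low = sum(r * u for r, u in zip(resp, powers)) / weight
        mean_high = sum((1.0 - r) * u for r, u in zip(resp, powers)) / (n - weight)
    return p_low, mean_low / 2.0, mean_high / 2.0


if __name__ == "__main__":
    data = synthetic_powers(4000, 0.7, 1.0, 50.0, random.Random(1))
    p_low, var_low, var_high = fit_two_state_mixture(data)
    print(p_low, var_high / var_low)
```

Applied to each 128-sample window in turn, a fit of this kind yields the low-state probability and the high-state to low-state variance ratio as functions of range.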

The  multistate  nature  of  the  frequency  domain 
beamformed  interference  is  implied  by  the  low 
significance  levels  shown  in  figures  6b  through  9b  of 
the  test  for  the  one-state  Gaussian  mixture  model 
(significance  levels  were  truncated  at  10“^  and  in 
some  plots  all  values  were  at  or  below  this  value). 
Also,  the  significance  levels  of  the  two-state  Gaussian 
mixture  model  distribution  are  generally  lower  for  the 
beamformed  data  than  for  the  hydrophone  data.  This 
can  be  verified  by  comparing  figures  6  through  9  with 
figure  B-3  of  appendix  B. 
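The one-sample Kolmogorov-Smirnov comparison can be sketched as follows. The sketch assumes the test is applied to the powers u = |z|^2 against the mixture distribution function 1 - sum_k p_k exp(-u / (2 sigma_k^2)), and it uses the asymptotic Kolmogorov series for the significance level, ignoring the effect of estimating the model parameters from the same data. The procedure in appendix A may differ in these details.

```python
import math
import random


def mixture_power_cdf(u, probs, variances):
    """Distribution function of |z|^2 for a zero-mean Gaussian mixture."""
    return 1.0 - sum(p * math.exp(-u / (2.0 * v)) for p, v in zip(probs, variances))


def ks_statistic(samples, cdf):
    """Largest distance between the empirical and the model distribution functions."""
    ordered = sorted(samples)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_significance(d, n, terms=100):
    """Asymptotic significance level of the Kolmogorov-Smirnov statistic."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here and its value is essentially 1
    total = 0.0
    for j in range(1, terms + 1):
        total += 2.0 * (-1.0) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
    return min(1.0, max(0.0, total))


if __name__ == "__main__":
    rng = random.Random(3)
    powers = [rng.expovariate(0.5) for _ in range(2000)]   # one state, variance 1
    d = ks_statistic(powers, lambda u: mixture_power_cdf(u, [1.0], [1.0]))
    print(d, ks_significance(d, len(powers)))
```

A low significance level for the one-state model together with an acceptable level for the two-state model is the pattern interpreted in the text as evidence of multistate behavior.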

The  dynamic  range  of  the  frequency  domain 
beamformed  data  is  generally  higher  than  for  the 
hydrophone  data.  Generally  speaking,  only  two 
successive  10-kilometer  estimates  are  described  by 
nearly  the  same  mixture  model,  indicating  that  the 
interference  should  be  modeled  by  a  given  Gaussian 
mixture  model  for  periods  of  time  not  exceeding  the 
time  for  the  interferer  to  move  to  or  away  from  the 
receiving  array  by  more  than  30  kilometers. 

The Kolmogorov-Smirnov one-sample test significance
levels tended to decrease as the variance ratios
increased. See appendix A for a description of this
and other statistical tests used to evaluate model
distributions. The absolute values of the correlations
are 0.25, 0.20, 0.38, and 0.72 for the data in figures
11, 12, 13, and 14, respectively. Inspection of the
cumulative distributions suggests that often a three or
more  state  Gaussian  mixture  model  would  fit  the  data 
segments  having  large  dynamic  ranges  better  than  a 
two-state  model.  Examples  of  such  distributions  are 
shown  later  in  this  subsection.  This  observation  led  us 
to  investigate  the  applicability  of  three-state  mixture 
models  to  the  simulated  moving  interferer  data. 

Figures 11 and 12 present histograms of the
significance levels of one-state, two-state, and
three-state models compared to the empirical
distributions of the data for the Chi-squared test and
the Kolmogorov-Smirnov one-sample test,
respectively. Figure 13 presents a histogram of the
average differences (L2 norms) between the
one-state model and the two-state model, the
one-state model and the three-state model, and the
two-state model and the three-state model. These
histograms reveal that the two-state and three-state
models often lead to distributions that differ.

Figures 14 through 18 show five examples of the
one-state model, two-state model, and three-state
model cumulative probability functions compared to
the empirical cumulative probability function for the
beamformed data. Figures 14 and 15 illustrate cases
when the two-state model fit and three-state model fit
to the beamformed data are nearly the same and quite
different, respectively. The three-state fit is clearly
better than the two-state fit for the data presented in
figure 15. Figure 16 shows a worst-case fit for the
one-state model with the lowest significance level for
the Chi-squared test. Note that the tail of the empirical
distribution falls off much more slowly than the tail of the
one-state fit, leading to a high Chi-squared test score
and correspondingly low significance level. This
distribution is well fit by the three-state model and not
well fit by a two-state model. Figure 17 shows a
worst-case fit for the two-state model according to the
Chi-squared test; the three-state fit, which is nearly
the same as the two-state fit, is also poor. Figure 17
illustrates a distribution that requires more than three
states to be well fit. Figure 18 presents the data for a
worst-case three-state fit according to the
Kolmogorov-Smirnov one-sample test. These
examples illustrate the great variety of cases that
arose in fitting the frequency domain beamformed
interference levels from a moving ship.

Figure 19 summarizes the probabilities of occurrence
of the three-state model parameters $p_L$, $p_M$, $p_H$,
$\sigma_H^2/\sigma_L^2$, and $\sigma_H^2/\sigma_M^2$ obtained by using the EM algorithm for the
beamformed data. The surprising result is the high
percentage  of  cases  in  which  the  best  three-state 
model probability was between 0.4 and 0.6 as shown
in figure 19b. Analysis was conducted to cluster the
parameter data for the three-state model. The
analysis did not reveal the strong clustering of model
parameters for the three-state models as exhibited for
the parameters of the two-state models. The best
defined cluster (12%) of joint model parameters
occurred for low-state probability between 0.1 and 0.2,
the middle-state probability between 0.36 and 0.55,
the high-state to low-state variance between 64 and
2048, and the high-state to middle-state variance
between 4 and 64. The next best defined cluster
(9%) of joint model parameters occurred for low-state
probability  between  0.2  and  0.3,  the  middle-state 
probability  between  0.3  and  0.6,  the  high-state  to 
low-state  variance  between  64  and  1028,  and  the 
high-state  to  middle-state  variance  between  4  and  32. 

The  simulations  suggest  that  a  multi-state  Gaussian 
mixture  model  may  better  characterize  the  interferer 
power  fluctuations  over  a  half  hour  than  does  a 
one-state model. The simulations for a hydrophone
(described in appendix B) and a vertical array indicate
that mixture models should apply to both, with the ratio
of high-state to low-state variance greater for the
beamformed  output  interference  power  than  for  the 
hydrophone  output  interference  power. 

MDA  Noise  Statistics. 

Frequency  domain  statistics  were  obtained  by 
spectrally  processing  selected  segments  of 
hydrophone  data  collected  during  the  MDA 
experiment.  Three  half-hour  segments  of  hydrophone 
data  for  specific  Fourier  transform  frequency  bins 
containing dominant narrowband lines, presumably
ship-generated,  were  selected  for  detailed  analysis. 

In  addition,  the  data  were  surveyed  for  all  frequencies 
to  establish  the  frequency  of  occurrence  of  bins 
exhibiting  two-state  Gaussian  mixture  characteristics 
and  to  determine  the  correlation  of  statistics  between 
adjacent  frequency  bins.  Hydrophone  data  were  used 
for  the  analysis  because  these  data  are  easier  to 
survey  than  beamformed  data  for  the  presence  of 
dominant  narrowband  interference. 


The  selected  hydrophone  data  were  Fourier 
transformed  with  a  frequency  resolution  that 
represents  a  reasonable  choice  for  the  resolution  of  a 
matched  field  beamformer.  The  results  obtained  for 
hydrophone  data  should  be  relevant  because  matched 
field  beamforming  is  a  linear  process.  In  addition,  the 
simulations  presented  in  the  previous  subsection  and 
appendix  B  indicated  that  Gaussian  mixture 
characteristics,  albeit  with  slightly  different  mixture 
parameters,  should  be  observed  for  hydrophone 
outputs  as  well  as  for  the  beamformer  outputs. 

The  three  half-hour  segments  of  MDA  data  were 
collected  on  day  193  during  the  hours  0140  to  0740. 
These  segments,  hereafter  referred  to  as  segments  A, 
B,  and  C,  were  selected  because  their  data  exhibited 
high levels of narrowband interference as shown in
figures 20, 21, and 22. Segment A spans 0140 to
0210, segment B spans 0440 to 0510, and segment C
spans 0710 to 0740 zulu time. Frequency bins with
spikes near 24 Hz were analyzed in detail. The bins
chosen for segments A, B, and C have frequencies of
23.914, 23.950, and 23.804 Hz, respectively. The
interference levels in these bins are presumably
dominated by acoustic energy from ships. Note that
the large spike seen at 24 Hz in figure 22 is one of the
signals generated as part of the MDA experiment.

The  likely  interferer  sources  for  the  selected  segments 
were  further  characterized  by  beamforming  the 
hydrophone  data  by  using  a  modal  beamformer.  This 
beamformer  processes  the  vertical  component  of  the 
incoming  wavefront  as  described  in  the  previous 
subsection  (Bartlett  beamformer)  and  the  horizontal 
component  as  a  plane  wave.  Figures  23,  24,  and  25 
show  the  sum  of  mode  powers  as  a  function  of 
bearing  for  segment  A  at  a  frequency  of  23.914  Hz, 
for  segment  B  at  a  frequency  of  23.950  Hz,  and  for 
segment  C  at  a  frequency  of  23.804  Hz.  These 
figures  indicate  a  dominant  beam  and  several 
prominent  beamformer  side  lobes.  The  side  lobes 
exist  because  of  the  hydrophone  geometry  of  the 
MDA receiving array. Figure 23 indicates a ship on a
bearing of -69°, figure 24 a ship on a bearing of 118°,
and figure 25 a ship on a bearing of -3°.

The  hydrophones  chosen  were  the  ones  most 
strongly  ensonified  by  the  selected  narrowband 
sources. The chosen hydrophones were 21, 8, and
11, for segments A, B, and C, respectively. The time
series data were collected at 150 samples per second
and transformed by using a 4096-point fast Fourier
transform with 50% overlap and a Hanning window.
This transformation has a frequency resolution of
0.036 Hz and results in about 130 complex Fourier
coefficients for each frequency for a half-hour of time
series data.
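The frequency resolution and the sample count quoted above follow from the stated sampling rate, transform length, and overlap. The short Python sketch below reproduces that arithmetic; the function name `fft_segmentation` is illustrative and not taken from the report, and the sketch assumes that only complete, non-zero-padded windows are counted.

```python
def fft_segmentation(sample_rate_hz, n_fft, overlap, duration_s):
    """Return (bin spacing in Hz, number of complete overlapped FFT windows)."""
    hop = int(n_fft * (1.0 - overlap))            # samples advanced per transform
    n_samples = int(sample_rate_hz * duration_s)  # samples in the record
    n_segments = (n_samples - n_fft) // hop + 1   # complete windows only (assumption)
    return sample_rate_hz / n_fft, n_segments

# Values stated in the text: 150 samples/s, 4096-point FFT, 50% overlap, half an hour
resolution, count = fft_segmentation(150.0, 4096, 0.5, 30 * 60)
print(f"bin spacing = {resolution:.4f} Hz, segments = {count}")
```

Under these assumptions the half-hour record yields 130 transforms, in agreement with the sample size used throughout the analysis.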

Figures  26,  27,  and  28  present  scatter  plots  of  the  real 
and  imaginary  components  of  the  complex  Fourier 
coefficients  for  the  selected  frequencies  of  the 
segment  A,  B,  and  C  hydrophone  data,  respectively. 
The  figures  indicate  that  the  real  and  imaginary 
components  of  the  samples  can  be  treated  as 
uncorrelated  random  variables.  Figures  29  and  30 
present  histograms  of  the  amplitudes  of  the  complex 
Fourier coefficients and the phases of the complex
samples, respectively, for the selected
frequency  data  for  segments  A,  B,  and  C.  It  is  difficult 
to  conclude  from  figure  29  whether  a  given  data  set  is 
best  fit  by  a  one-state  or  multiple-state  Gaussian 
mixture  model.  Figure  30  indicates  that  the  phase  is 
better  modeled  by  a  uniform  distribution  than  by  a 
single  value  or  several  discrete  values  of  phase. 

The  initial  statistical  analysis  of  the  frequency  domain 
hydrophone  data  was  structured  to  determine  if  the 
data  for  the  selected  frequencies  for  segments  A,  B, 
and  C  were  stationary.  The  statistical  tests,  which 
are  described  in  appendix  A,  also  addressed  the 
suitability  of  modeling  the  frequency  domain 
narrowband  interference  samples  by  a  circular 
Gaussian  distribution.  The  results  of  these  tests  are 
summarized in table 1.

The  results  of  applying  the  Kendall-Mann  tau  tests  to 
A,  B,  and  C  segments  indicate  that  sometimes  the  real 
and  imaginary  component  Fourier  coefficient  means 
and  variances  for  the  selected  interferer  frequency  bin 
seem  to  contain  trends  and  sometimes  not.  In 
general,  the  results  are  not  conclusive,  and  only  two 
tests,  lack  of  trends  in  the  means  of  the  real 
components  of  the  Fourier  coefficients  of  segment  A 
and  lack  of  trends  in  the  means  of  the  imaginary 
Fourier  components  of  segment  C,  had  high 
significance  levels.  Three  times  the  significance  levels 
are  below  0.10,  indicating  trends  in  the  means  of  the 
real  Fourier  coefficients  of  segment  C,  variances  of 
the  real  Fourier  coefficients  of  segment  A,  and 
variances of the real coefficients of segment C. All
these results are explained by supposing that at times
the narrowband interference had a frequency close to
that of the center frequency of the Fourier bin for
which the data are being analyzed, resulting in
slowly changing Fourier transform coefficients.

Table 1. Statistical Test Summary of Suitability of Single State Gaussian Mixture Model for Selected Data
[Table body not recoverable. Column headings: random variable, null hypothesis, test, significance level.]
Significance levels not adjusted to account for the estimation of parameters.
* Significance levels less than 0.001 are recorded as 0.
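As an illustration of the kind of trend test referred to above, the following Python sketch computes a Mann-Kendall trend statistic for a time-ordered series, for example a sequence of block means or block variances of the real Fourier components. It assumes the standard Mann-Kendall form with a normal approximation and no tie correction; the exact variant applied to the MDA data is the one described in appendix A and may differ in detail. The function name is hypothetical.

```python
import math

def mann_kendall(series):
    """Return (S, z, two-sided significance level) for a time-ordered series.

    Assumption: normal approximation to the null distribution, no tie correction.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)       # sign of each later-minus-earlier pair
    var_s = n * (n - 1) * (2 * n + 5) / 18.0   # variance of S when there is no trend
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)         # continuity-corrected score
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    significance = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return s, z, significance
```

A low significance level indicates a trend; a high level is consistent with the absence of one.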


The Kolmogorov-Smirnov two-sample test was used
to test the hypothesis that the distributions of the real
and imaginary components of the complex Fourier
coefficients  were  stationary  by  comparing  the 
distribution  of  the  first  half  of  the  samples  with  the 
distribution  of  the  second  half  of  the  samples.  A  lack 
of  stationarity  is  indicated  for  the  distributions  of  real 
components  of  the  Fourier  coefficients  for  segment  C 
and  for  the  imaginary  components  of  segment  B. 

The Kolmogorov-Smirnov test was also used to
compare the empirical distributions of the real and
imaginary components of the Fourier coefficients with
Gaussian distributions, and the distributions of the
normed squares of the complex samples with an
exponential distribution. The comparisons of the
distributions  with  Gaussian  distributions  resulted  in 
significance  levels  between  0.41  and  0.79  without 
accounting  for  the  estimation  of  the  Gaussian 
distribution  parameters,  which  would  lower  the 
significance  levels.  Thus,  both  the  real  and  imaginary 
components  of  the  Fourier  coefficients  for  all  three 
segments  of  data  are  close  to  Gaussian.  The  most 
striking  result  for  the  comparison  of  the  complex 
Fourier  coefficients  is  that  the  distribution  of  segment 
C  data  is  definitely  not  an  exponential  distribution  of 
norms  squared  (significance  level  0.004),  while 
segments  A  and  B  data  had  significance  levels  of  0.60 
and  0.61,  respectively.  This  result  indicates  that 
segment  C  might  provide  an  example  of  a  narrowband 
interferer  whose  frequency  domain  samples  could  be 
better  modeled  by  a  multistate  Gaussian  mixture 
model  than  a  one-state  Gaussian  mixture  model. 
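The comparison of norm squares with an exponential distribution can be sketched as follows. For a zero-mean circular Gaussian sample the norm square is exponentially distributed, so a large Kolmogorov-Smirnov distance from the fitted exponential points away from a one-state model. This Python sketch is an illustration only: the function names are hypothetical, the significance level uses the asymptotic Kolmogorov series with a common small-sample adjustment, and, as in table 1, it is not corrected for the estimated mean. The simulated inputs are arbitrary.

```python
import math
import random

def ks_distance_exponential(norm_squares):
    """KS distance between the sample and an exponential CDF with the sample mean."""
    xs = sorted(norm_squares)
    n = len(xs)
    mean = sum(xs) / n
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-x / mean)
        d = max(d, (i + 1) / n - cdf, cdf - i / n)  # check both sides of each step
    return d

def ks_significance(d, n, terms=100):
    """Asymptotic significance level (assumption: uncorrected for the fitted mean)."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                 for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * series))

rng = random.Random(0)

def draw(sigma):
    """Norm square of one circular Gaussian sample with per-component deviation sigma."""
    return rng.gauss(0.0, sigma) ** 2 + rng.gauss(0.0, sigma) ** 2

one_state = [draw(1.0) for _ in range(130)]
two_state = [draw(0.3 if rng.random() < 0.5 else 1.5) for _ in range(130)]
for name, data in (("one-state", one_state), ("two-state", two_state)):
    d = ks_distance_exponential(data)
    print(name, round(d, 3), round(ks_significance(d, len(data)), 3))
```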

Figure 31 suggests a mechanism by which the
data of segment C are better fit by a multistate
mixture model than by a single-state model. The results
presented in figure B-3 of appendix B show that
amplitude oscillations consistent with a mixture model
may occur in hydrophone data dominated by reception
of a narrowband signal from a moving ship because of
the interaction between different modes. Figure 31
presents the mode spectrum of the segment B and C
data for bearings of 127° and -3°, respectively, the
directions from which the most power arrived for the
selected frequencies. Note
that the modal structure of segment C differs from that
of segment B in that there are two peaks in the mode
spectrum of segment C and single peaks in the mode
spectra of segment B. Thus, segment C is more likely
to exhibit modal interference effects than segment B.


Given that the segments were fairly Gaussian-like,
and the earlier results that when in doubt the best
mixture model is the one with the fewest states, we
decided to fit the selected hydrophone frequency
domain data with two-state mixture models. An
important additional consideration was that the mixture
model parameter estimation technique needed to give
reasonable results for sample sizes around 130.

The EM procedure was used to estimate two-state
Gaussian mixture parameters for the selected
frequency data for segments A, B, and C. In addition,
the EM procedure was used to fit a three-state model
to the selected frequency data for segment C. Table 2
summarizes the two-state mixture model parameters
obtained in this way and table 3 summarizes the
three-state mixture model parameters.


Table  2.  Two-state  Gaussian  mixture  model 
parameters  for  selected  hydrophone  data. 


                          Segment
Parameter            A        B        C
$p_L$               1.00     1.00     0.47
$p_H$               0.00     0.00     0.53
$\sigma_L^2$          97      269      108
$\sigma^2$            97      269      327
$\sigma_H^2$           0        0      520

The Kolmogorov-Smirnov one-sample test and the
Chi-squared test were used to compare the two-state
mixture model distributions to the distributions of the
data for segment C and resulted in significance levels of
0.86 and 0, as compared with 0.004 and 0 for the
exponential distribution of norms squared. Thus, the
segment C data are better fit by a two-state Gaussian
mixture model than by a one-state mixture model. The
data are slightly better fit, as expected, by a
three-state mixture model, as indicated by an increase
of the Chi-squared test level of significance to 0.005.
However, the Chi-squared test result indicates the
segment C data might be better fit by a mixture model
with more than three states. These conclusions are
borne out by the distributions shown in figure 32. The
one-state model distribution does not fit either tail of
the empirical distribution. The two-state and
three-state model distributions fit the tails of the
empirical distribution better than the one-state model
distribution, but neither fits the middle of the empirical
distribution very well.

Table 3. Three-state Gaussian mixture model
parameters for hydrophone data Segment C.

Parameter          Value
$p_L$               0.34
$p_M$               0.61
$p_H$               0.05
$\sigma_L^2$          92
$\sigma_M^2$         360
$\sigma_H^2$        1998

We briefly considered how close the probabilities given
in table 3 would be to those with the relationships
predicted by a three-state approximation to a
Middleton Class A noise model. Toward this end, we
found the value of $A$ for which $p_L = e^{-A}$, $p_M = Ae^{-A}$,
and $p_H = 1 - e^{-A} - Ae^{-A}$ gives the best least squares
fit to the state probabilities presented in table 3. The
result is $A = 0.84$, for which $p_L = 0.43$, $p_M = 0.36$, and
$p_H = 0.21$. Next, we fixed the low-state variance and
the total variance, and minimized the total least
squared error as a function of the middle-state
variance for the resulting distribution. This leads to
$\sigma_L^2 = 92$, $\sigma_M^2 = 642$, and $\sigma_H^2 = 372$. If we impose the
additional condition that the middle-state variance is
between the other two variances, we obtain
$\sigma_L^2 = 92$ and $\sigma_M^2 = \sigma_H^2 = 543$. The best three-state
model fit is quite different from a three-state
approximation to the Middleton Class A noise model.
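The least squares fit for the Class A parameter can be reproduced with a simple one-dimensional search. The Python sketch below uses the three state-probability relationships quoted above and the state probabilities of table 3; the grid search, its step size, and the function names are choices made for this illustration, not details taken from the report.

```python
import math

def class_a_probabilities(a):
    """State probabilities of the three-state Class A approximation for parameter A."""
    p_low = math.exp(-a)
    p_mid = a * math.exp(-a)
    return p_low, p_mid, 1.0 - p_low - p_mid

def fit_a(target, step=0.001, a_max=5.0):
    """Grid search (illustrative choice) for the A with least squared error to target."""
    best_a, best_err = None, float("inf")
    for i in range(1, int(a_max / step) + 1):
        a = i * step
        err = sum((p - t) ** 2 for p, t in zip(class_a_probabilities(a), target))
        if err < best_err:
            best_a, best_err = a, err
    return best_a

a_hat = fit_a((0.34, 0.61, 0.05))   # p_L, p_M, p_H from table 3
print(round(a_hat, 2), [round(p, 2) for p in class_a_probabilities(a_hat)])
```

The search settles near the value of $A$ quoted in the text, and the implied probabilities differ markedly from those of table 3, in particular for the middle and high states.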

Figure 33 summarizes the salient features of the
low-state membership function (low-state probability)
for the two-state model best fitting segment C. Figure
33a shows that on two occasions, samples 20 through
46 and samples 98 through 130, all the samples were
with high probability in the low state, while the
remainder of the time successive samples remain with
high probability in the low state for only a few samples
at a time. Observe that the ratios of high-state to low-state
variance would decrease if the data were processed
after averaging the power over as few as 6 samples.
Figure 33b suggests that about 80% of the samples
are assigned with reasonable probability to
either the low state or the high state of a two-state
model.
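The low-state membership function is the conditional probability that a sample belongs to the low state given its value. A minimal Python sketch follows, using the zero-mean circular Gaussian state densities of appendix A and the segment C parameters of table 2. It assumes that the tabulated variances are per-component variances in the units of the Fourier coefficients; the test amplitudes are arbitrary, and the function name is hypothetical.

```python
import math

def low_state_membership(z, p_low, var_low, var_high):
    """Posterior probability that complex sample z was drawn from the low state."""
    power = abs(z) ** 2
    f_low = math.exp(-power / (2.0 * var_low)) / (2.0 * math.pi * var_low)
    f_high = math.exp(-power / (2.0 * var_high)) / (2.0 * math.pi * var_high)
    weighted_low = p_low * f_low
    return weighted_low / (weighted_low + (1.0 - p_low) * f_high)

# Segment C two-state parameters from table 2 (variance convention is an assumption)
for amplitude in (5.0, 20.0, 40.0):
    z = complex(amplitude, 0.0)
    print(amplitude, round(low_state_membership(z, 0.47, 108.0, 520.0), 3))
```

Small-amplitude samples are assigned mostly to the low state and large-amplitude samples mostly to the high state, which is the behavior summarized in figure 33.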

Narrowband  ship  lines  are  likely  to  occupy  adjacent 
frequency  bins  for  the  frequency  resolutions  of  the 
Fourier  transforms  used  to  obtain  the  selected 
frequency  data  for  segments  A,  B,  and  C.  This 
feature is of interest, because it allows adjacent
frequency  bin  data  to  be  used  to  construct  a  noise 
model  for  the  interference  of  the  middle  bin  and  in  this 
way  to  obtain  presumably  signal-free  noise  samples. 
To  gain  some  understanding  of  how  applicable 
Gaussian  mixture  models  might  be  to  modeling 
narrowband  interference,  all  frequency  bins  for 
segment  C  data  were  fit  by  a  two-state  model.  The 
results  are  summarized  in  figure  34.  Of  particular 
interest  is  that  several  adjacent  bins  to  that  of  the 
selected  frequency  for  segment  C,  23.804  Hz,  are 
better  fit  by  a  two-state  model  than  a  one-state  model. 
The  significance  levels  presented  in  figure  34b  use  the 
Kolmogorov-Smirnov  one-sample  test  with 
significance  levels  uncorrected  for  parameters 
estimated. In particular, we call the reader's attention
to  the  data  presented  in  figures  34c  and  d.  Figure  34 
also  indicates  several  other  frequency  bins  for  which 
the  interference  might  better  be  modeled  by  a 
multiple-state  model  than  by  a  one-state  model. 

Summary. 


The  real  and  imaginary  components  of  the  Fourier 
transforms  of  distant  shipping  noise  can  be  modeled 
by independent Gaussian distributions. Underwater
acoustic  interference  generated  by  individual  ships  can 
be  modeled  by  Gaussian  mixture  models. 

Simulations  were  conducted  to  predict  received 
narrowband  energy  from  a  moving  source  by  a 
hydrophone  located  in  the  deep  sound  channel  or  by  a 
vertical  array  located  in  the  deep  sound  channel. 

These  simulations  indicated  that  the  interplay  between 
ship  movement  and  propagation  mode  interaction 
leads  to  the  received  energy  being  better  modeled  by 
a  multiple-state  Gaussian  mixture  model  than  by  a 
one-state Gaussian mixture model. Usually the
interference  is  well  modeled  by  a  two-state  or 
three-state  Gaussian  mixture  model,  with  rare 
occurrences  of  time  periods  during  which  the 
simulated  data  would  be  better  modeled  by  a 
Gaussian  mixture  model  with  more  than  three  states. 

The EM algorithm was used to obtain estimates
of the two-state and three-state parameters
that best fit the simulated hydrophone, the simulated
beamformer output, and the MDA hydrophone output
interference  data.  The  distributions  obtained  by  using 
the  EM  estimated  parameter  values  were  then 
compared  to  the  empirical  distributions.  The 
consistency  of  the  results  obtained  indicates  that  the 
EM  algorithm  could  be  used  to  obtain  Gaussian 
mixture  model  parameter  estimates  from  real 
hydrophone  or  beamformer  output  data. 

The two-state mixture model parameters that best
model the simulated beamformer output power levels
clustered about two parameter vectors. The analysis
shows that two distinct two-state Gaussian mixture
models describe the received
power as the ship traverses multiple convergence
zones. The three-state model parameter values did
not cluster as much as the parameter values of the
two-state models. Given that related mixture models
were  manifested  in  both  the  hydrophone  and 
beamformed  simulation  data  and  that  MDA 
hydrophone  data  were  more  accessible  than  MDA 
hydrophone  data  processed  through  a  matched  field 
beamformer,  we  decided  to  verify  the  simulation 
results  by  analyzing  three  half-hour  segments  of  MDA 
hydrophone  data.  These  segments  were  chosen  to 
contain  narrowband  data,  presumably  from  a  single 
source.  Analysis  of  the  selected  segments  of  data 
indicated  that  one  of  the  three  was  well  described  by  a 
two-state  Gaussian  mixture  model,  while  the 
remaining  two  exhibited  Gaussian  statistics.  Thus, 
one  of  the  three  selected  MDA  hydrophone  data  sets 
exhibited  Gaussian  mixture  characteristics.  The 
reader  should  not  be  concerned  that  two  of  three  did 
not  exhibit  strong  mixture  characteristics,  for  the 
simulations  only  addressed  ship  interference  for  ship 
ranges  from  the  hydrophone  or  vertical  array  for  which 
there  would  be  significant  modal  interference.  Distant 
ships  would  not  be  expected  to  exhibit  such  modal 
interference. 
Figure 6. Interferer statistics as a function of range from a moving source and receiving vertical
array spatial cell at a depth of 100 meters and a range of 434 kilometers. Panels, each plotted against
range (km): (a) stationarity of statistics; (b) nature of model; (c) variance ratio estimates for two-state
model; (d) low-state probability estimates for two-state model.

Figure 7. Interferer statistics as a function of range from a moving source and receiving vertical array
spatial cell at a depth of 100 meters and a range of 450 kilometers.

Figure 8. Interferer statistics as a function of range from a moving source and receiving vertical array
spatial cell at a depth of 100 meters and a range of 464 kilometers.

Figure 9. Interferer statistics as a function of range from a moving source and receiving vertical array
spatial cell at a depth of 100 meters and a range of 470 kilometers.

Figure 10. Joint probability density function of low-state probability and state variance ratio
for simulated Bartlett beamformer cells at a depth of 100 meters and a range of 434, 450, 464,
and 470 kilometers.

Figure 11. Mixture model Chi-squared test significance levels for beamformed data.

Figure 12. Mixture model Kolmogorov-Smirnov one-sample test significance levels
for beamformed data.

Figure 13. Differences between mixture model probability densities for beamformed data.

Figure 14. Example of two-state and three-state fits to beamformed data giving
nearly equal results.

Figure 15. Example of a better three-state than a two-state fit for beamformed data.

Figure 16. Example of a worse one-state fit to the beamformed data.

Figure 17. Example of a worse two-state fit to the beamformed data.

Figure 18. Example of a worse three-state fit to the beamformed data.

Figure 19. Three-state parameter distributions for beamformed data. Panel (e): high-state to
low-state variance.

Figure 20. Average power for segment A.

Figure 21. Average power for segment B.

Figure 22. Average power for segment C.

Figure 25. Total 23.804-Hz mode power for segment C.

Figure 29. Amplitude distribution of Fourier coefficients
for selected frequencies for segments A, B, and C.

Figure 30. Phase distribution of Fourier coefficients
for selected frequencies for segments A, B, and C.

Figure 31. Modal spectra of segments B and C.

Figure 32. Cumulative probability functions comparison for segment C data.

Figure 33. Low-state probability properties for two-state model. Panel (b): low-state probabilities.
APPENDIX A

MIXTURE  MODEL  ESTIMATION  AND  COMPARISON 


This appendix discusses techniques that we used to estimate the parameters of two-state and three-state mixture
models  and  the  statistical  techniques  we  used  to  compare  the  estimated  distributions  obtained  with  the  empirical 
distributions  of  the  data. 

Parameter  Estimation  Techniques. 

Four  techniques  for  estimating  the  parameters  of  a  mixture  model  with  a  fixed  finite  number  of  states  were 
considered:  (1)  the  method  of  moments  technique,  (2)  the  minimum  distance  technique,  (3)  the  maximum 
likelihood technique, and (4) the expectation-maximization (EM) technique. The EM technique was chosen
because it was the only estimation technique of the four that can provide accurate estimates of mixture
parameters for either a two-state or three-state mixture model given around 100 samples (Zabin and Poor,
1989, 1990, 1991; Powell and Wilson, 1989). The references in this appendix are listed at the end of the body of
this  report. 

The EM technique is an indirect way of finding the maxima of the likelihood function (Dempster, Laird, and Rubin,
1977). We describe how it can be used to estimate Gaussian mixture parameters. Given an $S$-state Gaussian
mixture model and $N$ data samples $\{z_j \mid 1 \le j \le N\}$, the main difficulty in estimating the mixture model parameters
is that the partitioning of the samples $z_j$ by states is unknown. Let $\alpha$ denote the $S$-state mixture model parameter set
$\{p_1, \sigma_1^2, p_2, \sigma_2^2, \ldots, p_S, \sigma_S^2\}$, which is to be estimated from the data samples.

A description of the EM technique involves two likelihood functions, an incomplete log likelihood function
$L(\{z_j\} \mid \alpha)$ and a complete log likelihood function $CL(\{(z_j, s_j)\} \mid \alpha)$, defined as follows:

$$L(\{z_j\} \mid \alpha) = \sum_{j=1}^{N} \ln p(z_j \mid \alpha)$$

and

$$CL(\{(z_j, s_j)\} \mid \alpha) = \sum_{j=1}^{N} \ln p((z_j, s_j) \mid \alpha),$$

where $p(z_j \mid \alpha)$ is the probability of $z_j$ occurring given $\alpha$ and $p((z_j, s_j) \mid \alpha)$ is the probability
of $z_j$ occurring in state $s_j$, the state containing it, given $\alpha$.

The EM technique constructs a family of estimates of the vector $\alpha$:

$$\alpha_1, \alpha_2, \ldots, \alpha_n, \ldots, \quad \text{where } \alpha_n = (p_{n,1}, \sigma_{n,1}^2, p_{n,2}, \sigma_{n,2}^2, \ldots, p_{n,S}, \sigma_{n,S}^2).$$

One estimate is obtained for each initial set of parameters, and the best of the estimates obtained after
iterating the EM process is chosen as the estimate of the parameters. In particular, given the parameter vector
estimate $\alpha_n$, the parameter estimate $\alpha_{n+1}$ is constructed as follows:

Step 1. Select a set of initial parameter vectors that span reasonable models for the data being fitted.
(This step becomes increasingly difficult as the number of parameter values increases and places a practical limit
on the number of states in the mixture model for which the technique can be used to estimate state parameters.)
Choose a vector $\alpha_1$ from the set of initial parameter vectors to be recursively refined to a candidate estimate of $\alpha$.

Step 2. Given the estimate $\alpha_n$, $n = 1, 2, \ldots$, calculate the conditional probabilities of the
states given $z_j$ and $\alpha_n$:

$$p(s_j = k \mid z_j, \alpha_n) = \frac{p(z_j \mid s_j = k, \alpha_n)\, p_{n,k}}{\sum_{h=1}^{S} p(z_j \mid s_j = h, \alpha_n)\, p_{n,h}},$$

where

$$p(z_j \mid s_j = h, \alpha_n) = \frac{1}{2\pi\sigma_{n,h}^2}\, e^{-|z_j|^2 / (2\sigma_{n,h}^2)} \quad \text{for } 1 \le h \le S.$$

Step 3. Construct $\alpha_{n+1}$ by setting

$$p_{n+1,k} = \frac{1}{N} \sum_{j=1}^{N} p(s_j = k \mid z_j, \alpha_n) \quad \text{for } 1 \le k \le S$$

and

$$\sigma_{n+1,k}^2 = \frac{1}{2 N p_{n+1,k}} \sum_{j=1}^{N} |z_j|^2\, p(s_j = k \mid z_j, \alpha_n) \quad \text{for } 1 \le k \le S.$$

Step 4. Discontinue the estimation process after the successive estimates cease to change significantly,
draw a new parameter vector from the set of initial parameter vectors, and start a new estimation process to
refine it.

Step 5. Calculate $L(\{z_j\} \mid \alpha)$ for each of the final estimates obtained by refining the different initial
parameter vectors. Take as the best estimate of the parameter vector the final estimate which maximizes
$L(\{z_j\} \mid \alpha)$.

For a Gaussian mixture model, $\alpha_{n+1}$ is a maximum or saddle point of the expected value of the complete
likelihood function

$$ECL(\alpha_{n+1}) = \sum_{j=1}^{N} \sum_{k=1}^{S} \ln\big(p((z_j, s_j = k) \mid \alpha_{n+1})\big)\, p(s_j = k \mid z_j, \alpha_n).$$

Wu (1983) has shown, under conditions that hold for estimating Gaussian mixture models, that a sequence of
estimates of the expected value of the complete likelihood function converges to a saddle point or local maximum
of the incomplete likelihood function $L(\alpha)$. When this is the case, a good approximation of the best parameter
vector can be obtained by selecting the vector that maximizes the incomplete likelihood function from the
parameter vectors obtained using the EM algorithm for a well-chosen set of initial parameter vectors.
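The five steps above can be sketched in a few lines of Python. The sketch assumes zero-mean circular Gaussian state densities with per-component variance (so that the mean norm square of a state is twice its variance) and the corresponding variance update; the function names, the stopping tolerance, the iteration cap, and the small numerical floors are choices made for this illustration rather than details taken from the report.

```python
import math

def state_density(power, var):
    """Zero-mean circular Gaussian density of a sample with norm squared `power`."""
    return math.exp(-power / (2.0 * var)) / (2.0 * math.pi * var)

def em_fit(samples, init, max_iter=500, tol=1e-8):
    """Refine one initial vector [(p_k, var_k), ...] by repeating steps 2 and 3."""
    n = len(samples)
    powers = [abs(z) ** 2 for z in samples]
    params = [(float(p), float(v)) for p, v in init]
    for _ in range(max_iter):
        # Step 2: conditional state probabilities for every sample
        resp = []
        for pw in powers:
            w = [p * state_density(pw, v) for p, v in params]
            total = sum(w) or 1e-300               # floor guards against underflow
            resp.append([x / total for x in w])
        # Step 3: updated state probabilities and variances
        new = []
        for k in range(len(params)):
            mass = max(sum(r[k] for r in resp), 1e-300)
            var = sum(r[k] * pw for r, pw in zip(resp, powers)) / (2.0 * mass)
            new.append((mass / n, max(var, 1e-12)))
        change = max(abs(a - b) for old, cur in zip(params, new) for a, b in zip(old, cur))
        params = new
        if change < tol:                            # Step 4: estimates have settled
            break
    return params

def log_likelihood(samples, params):
    """Incomplete log likelihood of the samples for a parameter vector."""
    return sum(math.log(max(sum(p * state_density(abs(z) ** 2, v) for p, v in params),
                            1e-300)) for z in samples)

def best_fit(samples, initial_vectors):
    """Steps 1 and 5: refine each initial vector and keep the most likely result."""
    fits = [em_fit(samples, init) for init in initial_vectors]
    return max(fits, key=lambda fit: log_likelihood(samples, fit))
```

A call such as `best_fit(data, [[(0.5, 0.5), (0.5, 1.5)], [(0.1, 0.1), (0.9, 1.1)]])` refines each starting vector and returns the estimate with the largest incomplete log likelihood.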


The above technique can easily be applied to the estimation of the best two-state Gaussian mixture model. For
this case, the number of parameters can be reduced in the following way. The Fourier coefficients $\{z_j\}$ are
normalized so that the normalized coefficients have a mean norm squared value of 2. In addition, the search
need only be made over $p_L$, the low-state probability, and over $\sigma_L^2$, the low-state variance, since the
high-state probability is $p_H = 1 - p_L$ and the high-state variance is
$\sigma_H^2 = \frac{1 - p_L \sigma_L^2}{1 - p_L}$. Furthermore, both the
low-state probability and low-state variance are constrained to values between 0 and 1 for the normalized data.
The EM algorithm can be used to search for the best two-state model by initiating searches for each pair of
parameter values $(p_L, \sigma_L^2)$ in {(.1,.1), (.1,.5), (.1,.9), (.5,.1), (.5,.5), (.5,.9), (.9,.1), (.9,.5), (.9,.9)}. This
was the approach taken to fit data by a two-state Gaussian mixture model whenever the EM algorithm was used.
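The normalization and the nine starting points described above can be written out as a short Python sketch. The helper names are hypothetical; the constraint used is the one stated in the text, namely that the normalized data have a mean norm square of 2, so that the state variances weighted by the state probabilities sum to 1.

```python
def normalize(samples):
    """Scale complex samples so that their mean norm squared equals 2."""
    mean_power = sum(abs(z) ** 2 for z in samples) / len(samples)
    scale = (2.0 / mean_power) ** 0.5
    return [z * scale for z in samples]

def high_state(p_low, var_low):
    """High-state probability and variance implied by unit total variance."""
    p_high = 1.0 - p_low
    return p_high, (1.0 - p_low * var_low) / p_high

# The nine starting pairs (p_L, sigma_L^2) listed in the text
grid = [(p, v) for p in (0.1, 0.5, 0.9) for v in (0.1, 0.5, 0.9)]
initial_vectors = [[(p, v), high_state(p, v)] for p, v in grid]
for vector in initial_vectors:
    print(vector)
```

Each entry of `initial_vectors` is a complete two-state parameter vector that an EM refinement can start from.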
Computer simulations were used to characterize the performance of the EM algorithm as described to determine
two-state Gaussian mixture model parameters as a function of $p_L$ and $\rho = \sigma_H^2 / \sigma_L^2$. For each $p_L$ in {0.01, 0.05,
0.1, 0.2, 0.4, 0.8, 0.9, 0.95, 0.99}, $\rho$ in {1.6, 6.25, 25, 100}, and $N$ in {16, 31, 62, 125, 250, 500, 1000}, 100
sets of data were generated and the mean squared errors in the estimated parameters $p_L$ and $\sigma_L^2$ were
calculated.
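A data set for one point of such a simulation grid can be generated as in the following Python sketch. It assumes the normalization used in this appendix (unit total variance, so that the mean norm square is 2) and derives the two state variances from $p_L$ and $\rho$; the function names and the random seed are illustrative only.

```python
import math
import random

def draw_two_state(n, p_low, rho, rng):
    """Draw n zero-mean circular samples from a two-state mixture with unit total variance."""
    var_low = 1.0 / (p_low + (1.0 - p_low) * rho)  # solves p_L*v_L + p_H*rho*v_L = 1
    var_high = rho * var_low
    samples = []
    for _ in range(n):
        sigma = math.sqrt(var_low if rng.random() < p_low else var_high)
        samples.append(complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)))
    return samples, var_low, var_high

def mean_squared_error(estimates, truth):
    """Mean squared error of repeated estimates of one parameter."""
    return sum((e - truth) ** 2 for e in estimates) / len(estimates)

data, v_low, v_high = draw_two_state(125, 0.4, 6.25, random.Random(0))
print(len(data), round(v_low, 3), round(v_high, 3))
```

Repeating the draw 100 times for each grid point, fitting each set, and passing the fitted values to `mean_squared_error` gives the kind of error summary described in the text.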

Figures A-1a and 1b present mean square error estimates of $p_L$ and $\sigma_L^2$ as a function of sample size
$N$ for a Gaussian mixture model with two fairly distinct states, $p_L = 0.4$ and $\rho = 6.25$. These curves
indicate that even for $N = 16$, the EM procedure could estimate the state parameters with less than 8% error.
Figures A-2a and 2b present mean square estimates for $p_L$ and $\sigma_L^2$ as a function of $p_L$ for $\rho = 6.25$
and $N = 125$, while figures A-3a and 3b present them as a function of $\rho$ for $p_L = 0.4$ and $N = 125$. Observe
that the estimation process leads to reasonable estimates (10% or less mean square error in the parameter
estimated) for 125 samples provided the low-state probability is greater than about 0.20. Figure A-4, based on
data compiled from all of the simulations discussed here, indicates that 90% of the time fewer than 50 iterations
were required for convergence.

The best two-state Gaussian mixture model parameters can be used to initiate a search for the best three-state
Gaussian mixture model. The two-state process results in the parameter set $\{p_{2,L}, \sigma_{2,L}^2, p_{2,H}, \sigma_{2,H}^2\}$ for
normalized data with variance 1. Any three-state fit would lead to a new middle state with probability $p_{3,M}$ and
variance $\sigma_{3,M}^2$, with $\sigma_{2,L}^2 \le \sigma_{3,M}^2 \le \sigma_{2,H}^2$, and a low-state and high-state parameter set
$\{p_{3,L}, \sigma_{3,L}^2, p_{3,H}, \sigma_{3,H}^2\}$ with $0 \le p_{3,M} \le 1$, $\sigma_{3,L}^2$ no smaller than the minimum norm square of
any of the samples, and $\sigma_{3,M}^2 \le \sigma_{3,H}^2$. The number of initial parameter vectors was further reduced by
choosing state probabilities and variances in a manner consistent with the data.

Given $\sigma_{3,M}^2$ with $\sigma_{2,L}^2 < \sigma_{3,M}^2 < \sigma_{2,H}^2$ and $p_{3,M}$ with $0 \le p_{3,M} \le 1$, partition the samples into sets $S_L$, $S_M$, and
$S_H$. Let $S$ denote the set of samples from which the model is to be estimated. Order the norms of the elements
of $S$ from low to high and let $S_M$ consist of the $100\,p_{3,M}\%$ of the elements of $S$ with norm squares closest to [symbol illegible in source].
Let $S_L$ consist of the elements of $S$ with norm squares less than or equal to the norm squares of the elements of
$S_M$ and let $S_H$ consist of the remaining elements of $S$. Then let $p_{3,L}$ be the number of elements in $S_L$ divided by
the number of elements in $S$ and let $\sigma_{3,L}^2$ be the average value of the norm squares of the elements in $S_L$; let
$p_{3,H}$ be the number of elements in $S_H$ divided by the number of elements in $S$ and let $\sigma_{3,H}^2$ be the average
value of the norm squares of the elements in $S_H$. We choose three values of the middle state variance and
three values of the middle state probabilities to obtain 8 initial parameter estimates to estimate three-state
parameter vectors using the EM algorithm given a two-state model of the data. The three variances were
[expressions illegible in source], and the three low-state probabilities
were 0.2, 0.5, and 0.8. This initialization procedure was used to estimate three-state Gaussian mixture model
parameters whenever the EM algorithm was used to obtain such estimates.

Simulations were conducted to determine the ability of the EM algorithm to distinguish between a circular
Gaussian distribution and a two-state Gaussian mixture model. This was accomplished by determining the
distribution of the two-state mixture parameters when the input data were drawn from a circular Gaussian
distribution. For Gaussian noise, either the probability of one of the states should be near zero, or the ratio of
the variances should be near 1. Figure A-5a shows the parameters estimated for 1280 sets of 130 samples
drawn from a zero-mean unit variance circular Gaussian distribution. Figures A-5b, c, and d are the cumulative
distributions of the ratio of the estimated variances for those points in figure A-5a satisfying
$0.1 \le p_L \le 0.9$, $0.2 \le p_L \le 0.8$, and $0.3 \le p_L \le 0.7$, respectively. These plots can be used to estimate certain
joint probabilities such as $P\{\sigma_H^2/\sigma_L^2 > 4 \text{ and } 0.2 \le p_L \le 0.8\} = (0.05)(0.136) = 0.0068$ for Gaussian input.
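A two-state fit of this kind can be sketched with a textbook EM iteration for a zero-mean circular Gaussian mixture. The sketch below is an assumption about the form of the algorithm, not the report's implementation: it works on the norm squares $u = |z|^2$, uses the component density $\frac{1}{\pi\sigma^2}e^{-u/\sigma^2}$, and starts from arbitrary initial values chosen here for illustration.

```python
import math
import random

def em_two_state(u, iters=100):
    """Fit (p_L, p_H) and (var_L, var_H) to norm squares u = |z|^2 by EM."""
    n = len(u)
    mean_u = sum(u) / n
    p = [0.5, 0.5]                       # assumed starting probabilities
    var = [0.5 * mean_u, 2.0 * mean_u]   # assumed starting variances
    trace = []                           # log-likelihood before each update
    for _ in range(iters):
        # E-step: posterior state probabilities, computed in the log domain.
        resp, loglik = [], 0.0
        for ui in u:
            logs = [math.log(p[k]) - math.log(math.pi * var[k]) - ui / var[k]
                    for k in range(2)]
            top = max(logs)
            log_norm = top + math.log(sum(math.exp(v - top) for v in logs))
            loglik += log_norm
            resp.append([math.exp(v - log_norm) for v in logs])
        trace.append(loglik)
        weights = [sum(r[k] for r in resp) for k in range(2)]
        if min(weights) < 1e-12 * n:     # one state has emptied; stop
            break
        # M-step: weighted fractions and weighted mean norm squares.
        for k in range(2):
            p[k] = weights[k] / n
            var[k] = sum(r[k] * ui for r, ui in zip(resp, u)) / weights[k]
    if var[0] > var[1]:                  # report the low state first
        p.reverse()
        var.reverse()
    return p, var, trace

# Example: 130 norm squares from a unit-variance circular Gaussian.
random.seed(0)
gaussian_u = [random.expovariate(1.0) for _ in range(130)]
p_est, var_est, _ = em_two_state(gaussian_u)
```

For Gaussian input, the estimate is expected to show either a state probability near zero or a variance ratio `var_est[1] / var_est[0]` near 1, which is the behavior the simulation described above was designed to quantify.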

Monte Carlo studies were also done to determine the distribution of the two-state mixture model parameters when
the input data were obtained from a sinusoid in white Gaussian noise. The Fourier transform of such data in the
frequency bin containing the signal is the sum of noise data, which have a zero-mean circular Gaussian density,
and a complex sinusoid. For different choices of the amplitude and precession rate of the complex sinusoid,
1280 independent sets of 130 samples were generated, and the EM algorithm was run on these data. Figures
A-6a, b, and c show the distribution of the estimated variance ratio $\sigma_H^2/\sigma_L^2$ and low-state probability $p_L$ for
signal-to-noise ratios of 1.8, 4.8, and 7.8 dB, respectively, and a precession rate of $0.5\pi$ rad/(FFT sample). This
precession rate corresponds to the signal frequency lying midway between two center frequencies of the FFT and
using a window with 50% overlap in the calculation of the FFT. The precession rate had very little impact on the
distribution of the estimated parameters in this study.
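The quoted precession rate can be checked with a short calculation. The symbols are introduced here for illustration: $T$ is the FFT window duration, so that bin centers are spaced $1/T$ apart, $\delta f$ is the offset of the signal from the nearest bin center, and $\Delta t$ is the time between the starts of successive 50%-overlapped windows.

```latex
\delta f = \frac{1}{2T}, \qquad
\Delta t = \frac{T}{2}, \qquad
\Delta\phi = 2\pi\,\delta f\,\Delta t
           = 2\pi \cdot \frac{1}{2T} \cdot \frac{T}{2}
           = \frac{\pi}{2}\ \text{rad per FFT sample}.
```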

Figure A-5 suggests that for a set of 130 samples having a Gaussian distribution, the EM algorithm is unlikely to
estimate Gaussian mixture parameters $p_L$ and $\sigma_H^2/\sigma_L^2$ satisfying $0.2 \le p_L \le 0.8$ and $\sigma_H^2/\sigma_L^2 > 4$. Figure A-6
suggests that the EM algorithm is unlikely to estimate a significant amount of mixing in successive Fourier
coefficients of a sinusoid in Gaussian noise. Furthermore, as the signal-to-noise ratio increases, the estimated
parameters of the successive Fourier coefficients of the signal-plus-noise process stabilize at values that indicate
essentially no mixing.

Statistical  Tests  to  Compare  Interference  Models. 

Statistical tests were selected to characterize the various features of the complex samples of interference data.
In particular, statistical tests were selected to determine the suitability of modeling the interference data statistics
by a spherical Gaussian distribution or by a Gaussian mixture distribution. These tests were used to analyze
simulated data and MDA hydrophone data. This section briefly describes the statistical techniques that are used
in our discussion of information processing.

Both  spherical  Gaussian  and  Gaussian  mixture  models  require  that  the  real  and  imaginary  baseband  sample 
components  are  zero-mean  and  identically  distributed.  The  two  models  are  distinguished  by  the  fact  that  for  a 
Gaussian  model  the  statistics  are  stationary  and  for  a  Gaussian  mixture  model  nonstationary.  Therefore, 
statistical  tests  were  selected  to  evaluate  the  hypotheses  that  the  real  and  imaginary  baseband  sample 
components  of  narrowband  interference  are  stationary,  independent,  and  have  Gaussian  distributions,  and  that 
the  real  and  imaginary  parts  are  independent  of  each  other.  If  these  conditions  hold  and  the  variance  of  the  real 
part  is  equal  to  the  variance  of  the  imaginary  part,  the  probability  distribution  of  the  complex  interference  samples 
is  a  spherical  Gaussian  distribution.  In  addition,  statistical  tests  were  selected  to  determine  whether  the 
interference data were best described by a one-state, two-state, or three-state Gaussian mixture model.

The Kendall-Mann tau test (Baker, 1976; Bradley, 1968; Kendall and Gibbons, 1990) was used to determine the
presence of trends in the mean and variance of the interference data. This test is an application of the Kendall
rank correlation test. It is used to evaluate the independence of two time series of real numbers by searching for
a relationship in the ordering by magnitude of the two time series. Each time series is assumed to consist of
independent, identically distributed data. (The present description assumes that each time series is without ties,
i.e., $x_i \ne x_j$ and $y_i \ne y_j$ for $i \ne j$. See Kendall and Gibbons (1990) for the modifications necessary to cover the
case of possible ties.) The Kendall rank correlation test evaluates the hypothesis that the two time series
$\{x_i \mid 1 \le i \le N\}$ and $\{y_i \mid 1 \le i \le N\}$ consist of independent identically distributed data.


The Kendall rank correlation test statistic is formed by ordering the pairs $(x_i, y_i)$ according to the magnitude of
the x coordinate from its smallest value to its largest value. Let $\{(x_{i_k}, y_{i_k}) \mid 1 \le k \le N\}$ denote the reordered
pairs, with $x_{i_k} < x_{i_l}$ if $k < l$. For each $k$, let $I_k$ denote the number of indices $l > k$ with $y_{i_l} < y_{i_k}$ and $T_k$ denote
the number of indices $l > k$ with $y_{i_l} > y_{i_k}$. Then the test statistic is

$$S = \frac{2(T - I)}{N(N-1)},$$

where $T = \sum_{k=1}^{N} T_k$ and $I = \sum_{k=1}^{N} I_k$. If the time series $\{x_i \mid 1 \le i \le N\}$ and $\{y_i \mid 1 \le i \le N\}$ are
independent, then the distribution of $S$ is independent of their distributions and approaches a zero-mean normal
distribution with variance

$$\frac{2(2N+5)}{9N(N-1)}.$$

The Kendall rank correlation test can be used for sample sizes larger than 30. The Kendall-Mann tau test is applied
to a single real time series $\{y_i \mid 1 \le i \le N\}$ by setting $x_i = i$ and applying the Kendall rank correlation test.
Applied in this manner, the Kendall-Mann tau test detects trends in the mean of the elements of $\{y_i \mid 1 \le i \le N\}$.
Applied to the real time series $\{|y_i| \mid 1 \le i \le N\}$, it detects trends in the variances of the elements of
$\{|y_i| \mid 1 \le i \le N\}$.
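The statistic and its normal approximation can be sketched directly from the definitions above. The function names are hypothetical, the series is assumed to be free of ties, and the two-sided significance level is one possible convention rather than the one necessarily used in the report.

```python
import math

def kendall_statistic(x, y):
    """S = 2(T - I) / (N(N - 1)) for paired series without ties."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    ys = [y[i] for i in order]           # y values in order of increasing x
    t_count = i_count = 0
    for k in range(n):
        for l in range(k + 1, n):
            if ys[l] > ys[k]:
                t_count += 1             # later value larger: contributes to T
            elif ys[l] < ys[k]:
                i_count += 1             # later value smaller: contributes to I
    return 2.0 * (t_count - i_count) / (n * (n - 1))

def kendall_mann_test(y, use_magnitude=False):
    """Trend test on y (or on |y| to look for variance trends).

    Returns (S, z, p), where z is S scaled by its asymptotic standard
    deviation and p is a two-sided normal significance level."""
    series = [abs(v) for v in y] if use_magnitude else list(y)
    n = len(series)
    s = kendall_statistic(list(range(1, n + 1)), series)   # x_i = i
    variance = 2.0 * (2 * n + 5) / (9.0 * n * (n - 1))
    z = s / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p
```

A small significance level indicates that the ordering of the series is unlikely under the no-trend hypothesis.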


The Kolmogorov-Smirnov two-sample test was used as an additional test of the stationarity of the real and
imaginary interference data, and it was used to test the equality of the distributions between the real and
imaginary components of the complex-valued interference samples. If $\{x_i \mid 1 \le i \le N\}$ and $\{y_i \mid 1 \le i \le N\}$ are
two time series, the two-sample test measures the equality of the distributions of $\{x_i \mid 1 \le i \le N\}$ and
$\{y_i \mid 1 \le i \le N\}$ based on the maximum of the absolute value of the difference between their cumulative
distributions (Baker, 1976; Bradley, 1968). The resulting stationarity test for a real-valued $\{x_i \mid 1 \le i \le N\}$ is the
Kolmogorov-Smirnov two-sample test applied to $\{x_i \mid 1 \le i \le M\}$ and $\{x_i \mid M+1 \le i \le N\}$, where $M$ is the
greatest integer less than or equal to $N/2$. Assuming that the two distributions are identical, the distribution of the
statistic is independent of the distribution of the random variables and is given by an infinite sum, which is
commonly approximated by its first few terms (Press et al., 1988; Wilks, 1962).
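A minimal sketch of the split-half stationarity test follows. The function names are hypothetical. The significance level uses the alternating-series approximation and the effective-sample-size correction given in Press et al.; whether the report used exactly this form is an assumption.

```python
import math

def ks_two_sample_statistic(a, b):
    """Maximum absolute difference between two empirical distributions."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        va, vb = a[i], b[j]
        if va <= vb:
            i += 1
        if vb <= va:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_significance(d, n_a, n_b, terms=200):
    """Asymptotic significance level of the two-sample statistic d."""
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.05:                       # series converges slowly; Q is 1 here
        return 1.0
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

def split_half_stationarity(x):
    """Compare the first M = floor(N/2) samples with the remainder."""
    m = len(x) // 2
    first, second = x[:m], x[m:]
    d = ks_two_sample_statistic(first, second)
    return d, ks_significance(d, len(first), len(second))
```

The same `ks_two_sample_statistic` applies unchanged when the two inputs are the real and imaginary components of the complex samples.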


The Kolmogorov-Smirnov one-sample test was used to compare empirical distributions with fixed distributions,
e.g., the Gaussian and the exponential distributions. The statistic is the maximum of the absolute value of the
difference between the hypothetical cumulative distribution and the empirical cumulative distribution. The
distribution of the statistic is similar to that of the Kolmogorov-Smirnov two-sample test (Press et al., 1988).
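The one-sample statistic can be sketched as below. The function names are hypothetical, and the exponential distribution is shown only as one example of a fixed hypothetical distribution (it is the distribution of the norm square of a zero-mean circular Gaussian sample).

```python
import math

def ks_one_sample_statistic(samples, cdf):
    """Maximum absolute difference between the empirical distribution of
    the samples and a hypothetical cumulative distribution function."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical distribution jumps from (i - 1)/n to i/n at x,
        # so both sides of the jump are checked.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def exponential_cdf(mean):
    """Cumulative distribution of an exponential variable with the given mean."""
    return lambda u: 1.0 - math.exp(-u / mean) if u > 0 else 0.0
```

For example, `ks_one_sample_statistic(norm_squares, exponential_cdf(variance))` compares a set of norm squares with the one-state model of a given variance.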


The Chi-squared test (Press et al., 1988) was also used to compare empirical density functions and model density
functions. This test requires that the data be placed in bins. The statistic is formed by summing (over all bins) the
normalized square of the difference between the expected number of data points in each bin, based on the
theoretical distribution, and the realized number in each bin. The normalization factor is the expected number of
data points per bin. Assuming that the theoretical distribution is correct, as the number of samples goes to
infinity, the distribution of the statistic approaches a Chi-squared distribution with B-r-1 degrees of freedom, where
B is the number of bins into which the data are divided and r is the number of parameters estimated from the
data. The level of significance obtained for the Chi-squared test accounts for the number of estimated distribution
parameters. The level of significance obtained for the Kolmogorov-Smirnov one-sample test does not account for
the number of estimated distribution parameters; correction factors are available for the Kolmogorov-Smirnov
one-sample test for special cases (Stephens, 1974).
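The statistic and its degrees of freedom follow directly from the description above. The function names are hypothetical, and the conversion of the statistic to a significance level is left to a Chi-squared distribution routine.

```python
def chi_squared_statistic(observed, expected):
    """Sum over bins of (observed - expected)^2 / expected."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have one entry per bin")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def degrees_of_freedom(num_bins, num_estimated_params):
    """B - r - 1 for B bins and r parameters estimated from the data."""
    return num_bins - num_estimated_params - 1
```

Each expected count is the number of samples multiplied by the model probability of the bin, so bins with very small expected counts are usually merged before the statistic is formed.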




Figure  A-4.  EM  algorithm  convergence. 


Figure A-5. Two-state mixture model parameter estimates of Gaussian noise and their cumulative probabilities of occurrence.

Figure A-7. Normal scores plots for simulated Middleton class A noise. Each panel plots the values of a simulated
data set against the quantiles of the standard normal. Panels: (a) r = 1; (b) A = 1 and r = 0.5; (c) A = 1 and
r = 0.05; (d) A = 1.42 and r = 0.127.


APPENDIX  B 

Moving  Target  Hydrophone  Interference  Statistics 


In this appendix, we present simulation results for a moving target using the Kraken C (Porter, 1992) modal
model to calculate the received interference power at a hydrophone. (References in this appendix are listed at
the end of the body of this report.) The geometry governing these simulations is that shown in figure 2 of the body
of this report for a hydrophone located near the middle of the deep sound channel.

We begin our investigation of the impact of interferer motion on received interferer statistics by examining time
series of pressure amplitudes for a 10-meter-deep interference source and for a receiving hydrophone at a depth of
800 meters. The interference source is above the deep sound channel and the receiving hydrophone is well within
the deep sound channel, as can be seen from an examination of figure 3 of the body of the report. The pressure
amplitudes are normalized so that 0 dB corresponds to the average received power for the time series of
pressure amplitudes for the interferer source at ranges between 10 km and 876 km from a receiving hydrophone
at a depth of 800 meters. The results are presented in figure B-1 for four selected range intervals to illustrate the
manner in which the fluctuations change as a function of range. The rate and the magnitude of the fluctuations
depend on the range of the source. The fluctuations at the hydrophone arise from modes of different wave
numbers beating against each other. At longer ranges, the energies in the higher modes are reduced through
bottom interaction, as can be seen from an examination of figure B-2, which presents mode amplitudes for the
interference source at different ranges from the receiving hydrophone. As a result, fewer modes beat against
each other as the range increases, and the fluctuations tend to be slower the more distant the interferer is from the
hydrophone.
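The beating mechanism can be made explicit with the standard far-field normal-mode sum. The notation is introduced here for illustration and is not taken from the report: $k_m$ is the horizontal wavenumber of mode $m$, $A_m$ is a real amplitude that collects the mode excitation at the source depth, the mode shape at the receiver depth, and any attenuation, $r$ is range, and $v$ is the source speed.

```latex
p(r) \approx \sum_{m} \frac{A_m}{\sqrt{k_m r}}\, e^{i k_m r},
\qquad
|p(r)|^2 \approx \frac{1}{r} \sum_{m}\sum_{n}
  \frac{A_m A_n}{\sqrt{k_m k_n}} \cos\bigl((k_m - k_n)\, r\bigr),
\qquad
\Lambda_{mn} = \frac{2\pi}{|k_m - k_n|},
\qquad
\tau_{mn} = \frac{\Lambda_{mn}}{v}.
```

Each pair of modes contributes a fluctuation with range period $\Lambda_{mn}$ and, for a source moving radially at speed $v$, time period $\tau_{mn}$. When bottom interaction strips the higher modes, the remaining pairs have smaller wavenumber differences, which is consistent with the slower fluctuations at long range.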

Figure B-3 shows the results of the statistical tests and the estimated Gaussian mixture parameters for a source
at a depth of 10 meters and the receiving hydrophone at a depth of 800 meters. The amplitude of the pressure
data as the source moves from 60 to 375 kilometers was segmented into sets of 128 contiguous samples,
representing 27.5 minutes of data for a ship moving at 10 knots, and the sets were overlapped by 50%. The
abscissa is the horizontal distance between the hydrophone and the first point of each of the data sets. Figure
B-3a presents the significance levels of the Kendall-Mann tau test for the random variable 128-sample average
of norms squared and of the Kolmogorov-Smirnov two-sample test for stationarity, which compares the distribution
of the first half of the samples with the distribution of the second half of the samples. Figure B-3b shows the
significance levels of the Kolmogorov-Smirnov one-sample test for the one-state and the two-state Gaussian
mixture model distributions.
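The segmentation can be sketched as follows. The function name and arguments are hypothetical. At 10 knots (18.52 km/h), 27.5 minutes corresponds to roughly 8.5 km of source travel per 128-sample set.

```python
def overlapped_segments(samples, length=128, overlap=0.5):
    """Return (start_index, segment) pairs of fixed-length segments.

    Successive segments share the given fraction of their samples; a
    trailing partial segment is dropped."""
    step = int(length * (1.0 - overlap))
    if step < 1:
        raise ValueError("overlap leaves no advance between segments")
    return [(start, samples[start:start + length])
            for start in range(0, len(samples) - length + 1, step)]
```

The start index of each segment plays the role of the abscissa described above once it is converted to the range of the first sample in the set.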

The significance levels for the Kolmogorov-Smirnov one-sample test shown in figure B-3b are based on the
distribution of the test scores assuming that parameters have not been estimated. The significance levels after
adjustment for parameter estimation are lower than those indicated in figure B-3b. However, for the idealized
simulation results presented here, it was not deemed necessary to perform a more careful and computationally
demanding statistical analysis of the fit between these data and two-state Gaussian mixture models based on the
Cramer-von Mises and Anderson-Darling goodness-of-fit tests with significance levels adjusted for parameter
estimation (Darling, 1955; Stephens, 1974, 1976; Sukhatme, 1972).

A two-state Gaussian mixture model is characterized by the ratio of its high-state variance $\sigma_H^2$ to its low-state
variance $\sigma_L^2$ and by its low-state probability $p_L$, and these are the estimated parameters for the two-state Gaussian
mixture model plotted in figures B-3c and d. Note that the ratio of high-state to low-state variance is plotted in
figure B-3c on a log base 2 scale. The joint probability density function for the low-state probability and the ratio of
high-state to low-state variance is shown for the two-state Gaussian mixture parameter estimates for a moving
interferer in 10-kilometer steps from 350 to 500 kilometers. The selected data span a convergence zone located
at 372 kilometers.

Figure B-3. Interferer statistics as a function of range from a moving source and receiving hydrophone.


A single two-state mixture model with parameters $p_L = 0.5$ and variance ratio between 4 and 8 provides a good
fit to most of the selected hydrophone data, as shown in figure B-4. A second two-state mixture model, with a
high-state to low-state variance ratio exceeding 32, occurs around 10% of the time. Figure B-3c indicates that the
higher variance ratio mixture model applies to at most two successive samples, that is, to a period of time roughly
corresponding to the time required for the interferer to move 12 kilometers toward the receiving
hydrophone.


Figure  B-3a  suggests  that  the  amplitude  fluctuations  at  the  hydrophone  output  caused  by  the  source  motion  can 
often  be  better  described  by  a  two-state  Gaussian  mixture  model  than  by  a  one-state  model.  However,  the  ratio 
of  high-state  to  low-state  variance  is  only  6  dB  for  the  most  commonly  occurring  mixture  model. 


The simulated hydrophone data were also compared to the three-state model with parameters obtained using the
EM algorithm. The initial parameter vectors were obtained from the best estimate of the two-state parameters for
the data being fit, as described in appendix A. The time histories of the three-state levels of significance exhibited
properties similar to those of the two-state time histories of levels of significance, so these histories
are not presented. Instead, we focus on comparing the significance levels for comparisons of the data with
one-state, two-state, and three-state Gaussian mixture models and on determining the L2 norms of the differences
between the one-state and the two-state fits, the one-state and the three-state fits, and the two-state and
three-state fits to the data. Recall that for two vectors $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$, the L2 norm of
the difference is simply the Euclidean distance

$$\|x - y\|_2 = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}.$$
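One way to compare two fitted models in this sense is to sample both probability densities on a common grid and take the Euclidean distance between the sampled vectors. The sketch below is an illustration under assumptions: the function names are hypothetical, and the density shown is that of the norm square $u = |z|^2$ for a zero-mean circular Gaussian mixture, which may differ from the density the report actually compared.

```python
import math

def mixture_density(u, probs, variances):
    """Density of the norm square u for a circular Gaussian mixture."""
    return sum(p / v * math.exp(-u / v) for p, v in zip(probs, variances))

def l2_distance(x, y):
    """Euclidean distance between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def density_difference(model_a, model_b, grid):
    """L2 norm of the difference of two mixture densities on a grid.

    Each model is a (probabilities, variances) pair."""
    fa = [mixture_density(u, *model_a) for u in grid]
    fb = [mixture_density(u, *model_b) for u in grid]
    return l2_distance(fa, fb)
```

A value near zero indicates that the two parameter sets describe nearly the same density over the grid, even if the individual state parameters differ.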


Histograms of the significance levels for the Chi-squared test and the Kolmogorov-Smirnov one-sample test
comparing the empirical data with one-state, two-state, and three-state Gaussian mixture models are presented
in figures B-5 and B-6, respectively. The relationship between levels of significance and test scores differs for the
three cases, as shown in figure B-6. The different relationships occur because the number of degrees of freedom
for the Chi-squared test depends on the number of states for the mixture model: 23 for a one-state model, 21 for
a two-state model, and 19 for a three-state model. The Chi-squared test significance levels tend to be very low
(left-most bin shown in figure B-5) nearly 80% of the time for a one-state model, 17% of the time for a two-state
model, and 1% of the time for a three-state model. Thus either a two-state or a three-state model fits nearly all the
data better than a one-state model, as already discussed, while some of the time a three-state model fits the data
better than either a one-state or a two-state model. Figure B-7 presents a histogram of the L2 norms of the
differences between the models that best fit the data. These data show that more than 80 percent of the time, the
two-state and three-state fits lead to nearly the same probability density functions.
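The quoted degrees of freedom are consistent with the rule $\nu = B - r - 1$ if the data were placed in $B = 25$ bins. The bin count is inferred here and is not stated in the text. The parameter counts assume that each state contributes one variance and that the state probabilities sum to one.

```latex
\begin{aligned}
\text{one-state:}   &\quad r = 1, & \nu &= 25 - 1 - 1 = 23,\\
\text{two-state:}   &\quad r = 2 + 1 = 3, & \nu &= 25 - 3 - 1 = 21,\\
\text{three-state:} &\quad r = 3 + 2 = 5, & \nu &= 25 - 5 - 1 = 19.
\end{aligned}
```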

Figure B-8 summarizes the probabilities of occurrence of the three-state model parameters
$p_L$, $p_M$, $p_H$, $\sigma_M^2/\sigma_L^2$, and $\sigma_H^2/\sigma_L^2$ obtained using the EM algorithm for the hydrophone data. Unlike the
two-state case, no small set of three-state models predominates among those that best fit the hydrophone data. The
surprising result is the high percentage of cases in which the middle-state probability of the best three-state model
was between 0.4 and 0.6, as shown in figure B-8b. This means that a large percentage of samples formerly in the low
and high states fall into the middle state, while the middle-state to low-state variance ratio clustered about 3 and the
high-state to low-state variance ratio about 6, so that the dynamic range as measured by the difference between the
variances of the low and high states did not dramatically increase when the data were fit by a three-state model
rather than by a two-state model.

Figure B-4. Joint probability density function of low-state probability and state variance ratio for
simulated hydrophone data near a convergence zone.

Figure B-5. Mixture model Chi-squared test significance levels for hydrophone data.

Figure B-6. Mixture model Kolmogorov-Smirnov one-sample test significance levels for hydrophone data.

Figure B-7. Differences between mixture model probability densities for hydrophone data.

Figure B-8. Three-state parameter distributions for hydrophone data. Panels include (b) middle-state probability
and (d) middle-state to low-state variance.